DOCUMENT RESUME

ED 192 066
Eh 013 0B€ 



AUTHOR        Boruch, Robert F.; Cordray, David S.
TITLE         An Appraisal of Educational Program Evaluations:
              Federal, State, and Local Agencies.
INSTITUTION   Northwestern Univ., Evanston, Ill.
SPONS AGENCY  Department of Education, Washington, D.C.
PUB DATE      30 Jun 80
CONTRACT      300-79-0467
EDRS PRICE    MF01/PC17 Plus Postage.
DESCRIPTORS   Elementary Secondary Education; *Evaluation Criteria;
              *Evaluation Methods; Federal Aid; *Federal Programs;
              *Federal Regulation; Government (Administrative
              Body); Government Role; *Program Evaluation; School
              Districts; State Departments of Education

ABSTRACT 

This report concerns evaluation of federally supported educational
programs at the national, state, and local levels. It was undertaken
in response to Section 1526 of the Education Amendments of 1978, which
requires that the Commissioner of Education conduct a comprehensive
study of evaluation practices and procedures. Two broad sources of
information were used: contemporary research and development by other
researchers, and direct investigations by the project staff.
Introductory material is presented in the first chapter. Chapter 2
considers the rationale, evidence, and opinion bearing on why
evaluations are done; the confusion and argument engendered by general
demands for evaluation; and the audiences to whom evaluations are
addressed. Chapter 3 addresses the question of how evaluations are
executed. Chapter 4 covers the organization of evaluations and the
capabilities of evaluators, and chapter 5 considers quality of
evaluations. The way evaluation results are used is considered in
chapter 6, and case studies on the use of evaluative information are
included. Chapter 7 covers recommendations. An extensive bibliography
concludes the report. Legislative and management background, and
research strategies are contained in the appendixes. (Author/HLf)
[...] statutory provisions for evaluation are often unclear [...]
mechanisms for clarifying evaluation goals are necessary. We [...]
specific questions which need to be addressed and specific audiences
for results, and that evaluability [...] undertaken where
specification is not possible. We recommend [...] by Department and
Congressional staff to clarify Congressional information needs and to
decide when evaluation is warranted.

Evaluation activities at LEA and SEA levels vary considerably. The
capabilities [...] of evaluators depend heavily on these local
evaluation tasks. We recommend then that new demands for evaluation
aimed at LEAs and SEAs be preceded by a "capabilities assessment" to
determine whether time, money, and personnel exist to adequately
execute the tasks demanded. We recommend expansion of training and
technical assistance when evaluation demand is high.

Good evaluation designs help to prevent expensive mistakes. But they
are [...] innovations are planned independent of evaluation. We
recommend that pilot tests be undertaken before new programs or
program variations are adopted and that the introduction of new
programs be staged so that good designs can be exploited. Further, we
recommend that higher quality designs, especially randomized
experiments, be authorized explicitly in law for testing new programs,
new variations on existing programs, and new program components.

There is a need for independent, balanced critique of major program
evaluations [...] used in policy. We recommend routine critique of
major national evaluations required by law, periodic critique of
samples of evaluations generated at the local, state, and federal
levels, and statutory provisions for making statistical data available
for reanalysis.

We conclude that the arguments over whether evaluations are used are
frequently uninformed, that evaluations have been used, but that their
use can be improved. We recommend regular discussion between
evaluation staff of the Department and Congressional staff to
facilitate use, and routine [...] evaluations in Congressional reports
and in Department policy statements.

Recently developed standards and guidelines on evaluation are
inappropriate [...] screening of evaluation [...] competitive grant
programs. And they are a reasonable basis for reaching nontechnical
agreements at the federal level, about what can reasonably be expected
in an evaluation.
We conclude that although restrictions on program staffers initiating
conversations with Congressional staff may be warranted, restrictions
on evaluation unit staff are not. Restrictions do not foster better
understanding of Congressional needs for information or better
planning and ultimately they probably degrade utility of evaluations.
We recommend that no such restrictions be placed on the unit.

We conclude that the absence of regular discussion with Congressional
staff to plan evaluations degrades timeliness, relevance, and
credibility of evaluations. We recommend that the Department create
formal policy on this, and vigorously continue recent efforts to
establish a regular dialogue.

High quality evaluation designs for estimating the effects of programs
[...]. We recommend that the Department authorize explicitly the use
of randomized field experiments to plan and evaluate new programs, new
variations on existing programs, and program components.

We conclude that there is a need for independent critique of
evaluation reports produced at SEA, LEA, and federal levels. We
recommend periodic sampling of LEA and SEA reports to provide balanced
independent critique and we recommend formal provision in contracts
and policy for independent critique of major federal efforts.

We conclude that access to and identification of evaluation reports is
often inadequate. We endorse recent action to limit the clearance
period for reports to ten days and recommend adherence to the new
limit. We also recommend adherence to a more conscientious system of
specifying documents, identifying core recipients of reports, and
acknowledging authors.

We conclude that existing methods of tracking use of evaluation
results in the Department are poor. We believe that reporting use in
the Annual Evaluation Report is desirable, but it can be improved by
making reference more specific. We recommend that a formal system be
created to track use of results by program managers and by the
Congress. The earlier recommendations on dialogue between
Congressional staff and evaluation unit staff should help to assure
that more useful evaluations are mounted.

New programs are often badly implemented and we know considerably less
than we should about the implementation process. We recommend formal
intensive measurement of the degree to which activities match plans
where measurement techniques are available. We recommend adjoining
research on methods of measuring implementation to the implementation
process because measurement methods are often not available.






DIGEST: RECOMMENDATIONS
AND RATIONALE



Planning and Executing Evaluations

We recommend that the Congress direct the relevant staff of
Congressional committees to meet regularly with evaluation staff of
the Department to:

. reach agreement about when particular evaluations are warranted
  and the senses in which each evaluation required by law is possible.

. clarify Congressional information needs, quality of evidence
  required, and planning cycle for each major evaluation required
  by law,

. identify specific committees and groups as audiences for
  evaluation results,

. identify the changes in program or understanding which could
  occur on the basis of alternative findings.

The recommendation hinges partly on the fact that a statutory demand
to evaluate can imply activity ranging from journalistic reporting to
full-blown field experiments dedicated to estimating the effects of
innovation on children. The involvement of multiple interest groups is
often necessary, but complicates matters. At worst, general demands to
evaluate obscure the fact that feasibility of evaluation varies
enormously and that elaborate [...] or unnecessary. Periodic efforts
have been made by members [...] Congressional staff to assure that
production of evaluations coincides with authorization cycles, and
that Congressional needs are understood. The process is less regular
and less orderly than it ought to be.

Statutory Provisions for Evaluation

We recommend that, in constructing statutory provisions for
evaluation, the Congress:

. specify exactly which questions ought to be addressed and the
  audiences to whom results should be addressed.

. provide for formal assessment of the evaluability of the
  relevant program where specification is not possible.

. provide for statistically valid field testing of proposed
  evaluation requirements where specification is not possible
  and in-house assessment insufficient.



Though statutes are [...] about [...] reporting requirements,
references to evaluation are not specific. The simple requirement to
evaluate whether the [...] objectives of the statute is common and
vague. Hearings are [...]. Defining evaluation requirements in terms
of the questions that should be addressed is sensible so long as the
questions [...] make sense, answering them is feasible, and the
answers are likely to be useful. The specification of audiences,
especially particular committees and Congressional support agencies,
should enhance usefulness. We recognize that explicitness is often not
feasible or desirable. Consequently, we suggest formal investigation
of evaluability to clarify questions, audiences and the ways in which
results can be used, within a year after enactment of a demand for
evaluation.

Evaluator Capabilities 
We recommend that

. capabilities be assessed before new statutory evaluation
  requirements are directed at LEAs and SEAs to determine
  where resources are adequate to meet the demand,

. training or technical assistance be expanded when the
  demands are notable and capabilities low,

. the feasibility and desirability of direct contracts programs
  to capitalize on LEA and SEA capabilities be explored.

[...] conclusions that no real standard [...] "evaluator" exists.
Skills required of the evaluator depend on nature of the evaluation
demand and on LEA and SEA interest in evaluation. The second
recommendation is based on the finding that most [...] assistance when
the demand is high and want it. A small minority of LEAs have strong
evaluation units. But these are a major resource and we believe that
direct grant opportunities should be expanded to capitalize on them.

Use of and Authority for Better Evaluation Designs
We recommend that the Congress: 

. routinely consider pilot testing every new program, variations 
on existing programs, and program components before they are 
adopted at the national level, using high quality evaluation 
designs. 

. authorize the Secretary explicitly in each evaluation statute 
to use high quality designs, especially randomized field 
experiments, for planning and evaluating new program components, 
program variations, and new programs. 
The rationale for the first recommendation is that higher quality
[...] feasible before the program is adopted at the national level.
Better designs can be employed and conclusions then are likely to be
less ambiguous, and political-institutional constraints are likely to
be [...]. The introduction of new programs can be staged so that
earlier stages are a pilot for later ones. We stress formal tests of
new program components and new variations here because such
evaluations are not a matter of common practice. We will not learn how
to bring about clear, detectable changes without more conscientious
tests.

The second recommendation stems from our conclusion, based on this and
other research, that better designs must be used if the Congress or
the [...] estimates of the effects of programs on children. We [...]
estimating those effects in all cases. The process is complicated
under the best of conditions, despite cavalier announcements that the
[...] that it was unsuccessful [...]. [...] advocate explicit
authority in statutes for high quality designs, especially randomized
experiments, to facilitate their use. [...] explicit statutory
provision is essential because such designs are [...] that should be
recognized. The authorization should provide for review of the use of
these designs.
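
The logic of the randomized field experiments recommended above can be
illustrated with a minimal sketch. It is not drawn from the report: the
function names, the school labels, and the outcome scores are all
hypothetical, and a real evaluation would also need standard errors,
attention to attrition, and a defensible outcome measure. The sketch only
shows the two steps that make such designs less ambiguous: units are
assigned to the new program by chance, and the effect is estimated as the
difference in mean outcomes between the two groups.

```python
import random
from statistics import mean


def assign_at_random(units, seed=0):
    """Split units into (treated, control) halves by chance alone.

    `seed` only makes this illustration repeatable.
    """
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def estimate_effect(outcomes, treated, control):
    """Difference in mean outcome between treated and control units."""
    treated_mean = mean(outcomes[u] for u in treated)
    control_mean = mean(outcomes[u] for u in control)
    return treated_mean - control_mean


# Hypothetical pilot test: eight schools, half receive the new component.
schools = ["school_%d" % i for i in range(8)]
treated, control = assign_at_random(schools, seed=3)

# Hypothetical end-of-year scores, invented purely for illustration.
scores = {u: 55.0 for u in treated}
scores.update({u: 50.0 for u in control})

effect = estimate_effect(scores, treated, control)
```

Because chance alone decides which schools receive the component, a
difference between the groups cannot be attributed to how schools were
chosen, which is the sense in which such estimates are less debatable.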

Critique and Reanalysis of Evaluation Results

We recommend that in statutory requirements for evaluation of major
programs, the Congress:

. also require independent, balanced, and competent critique
  of evaluation results that are material to policy decisions.

. require critique of samples of evaluations submitted by
  LEAs and SEAs in response to legal requirements.

. require that statistical data produced by national evaluations
  be made available for reanalysis.

[...] commentary. We mean reasoned judgments [...] evaluation are
sensible and can inform decisions. The main reason for the
recommendation is that such criticism is [...] essential to enhance
credibility of good evaluations, to properly identify poor evaluations
as such, and to provide feedback to [...] contractors, and grantees
about the quality of their [...]. [...] no [...] competent critique of
evaluation reports produced by LEAs and SEAs in response to law, yet
many could benefit from criticism.
Use of Evaluation Results

We recommend that the Congress:

• direct staff of relevant committees, the Department, and
  the GAO to routinely outline which institutions can reasonably
  be expected to use results of each major evaluation and how
  such results might be used, during the design stage of every
  major program evaluation.

• specify exactly which evaluations have been used and why they
  were used, which have not been used and why they were not
  used, in authorizations and appropriations committee reports.

• require specific information about changes resulting from
  evaluation, whenever the law requires SEAs to describe uses
  of evaluation.

• explore the feasibility of direct competitive grants and
  contracts programs focused on improving the use of results
  at the LEA and SEA level.

The first recommendation's origins lie in the absence of any mechanism
for planning use at the national level. Simply put, unless specific
user groups are identified and some decision options laid out,
evaluation results are less likely to be used. Indeed, if there is no
clear way to link the evaluation with decisions or considerably better
understanding, one can argue that the evaluation shouldn't be done at
all. Specifying expectations will also help to make it easier to track
utilization and that in turn will help to inform judgments about how
evaluation resources could be better allocated. The recommendation to
cite useful and useless evaluations in federal reports and to require
SEAs and LEAs to record specific changes has the same objective:
understanding use better in the interest of better resource
allocation. The suggestion to identify useless evaluation is not an
invitation to criticize arbitrarily. We found that some LEAs and SEAs
are capable and interested in inventing and testing better ways to use
information. The suggestion to expand their opportunities for doing so
is based on this.



Standards and Guidelines 

Recently developed standards and guidelines for evaluation are not
appropriate for incorporation into law. They are sufficiently well
developed to recommend that the Congress:

. use such guidelines to understand what can reasonably be
  expected of evaluations.

. direct that agencies use them as a guide where appropriate
  to developing criteria for judging evaluation plans submitted
  by LEAs and SEAs,

. elicit assistance in the interpretation of guidelines from
  Congressional support agencies, such as GAO, that have been
  instrumental in their construction.
RECOMMENDATIONS TO THE DEPARTMENT
OF EDUCATION



Authority for Technical Discussion 



We recommend that the Department:

. authorize technical staff of evaluation units to initiate
  discussion of evaluation plans with pertinent Congressional
  staff, at their discretion, and refrain from directives which
  impede direct discussion.

The impetus for the recommendation is simple: Competent evaluators
can expect to do a good job only when they have the opportunity to
discuss Congress's information needs frequently. Restrictions on the
evaluation unit's initiating discussion with Congressional staff of
Committees that demand evaluation prevent the job from being done
better. We recognize that some restrictions on bureaucratic lobbying
for programs are warranted, and that some administrative rules are
necessary to keep the process orderly. [...] opportunity to figure out
what Congress can use decreases the likelihood that evaluations will
be timely, relevant, and credible, and the likelihood that the
Congress will find the results useful. Relaxing restrictions will not
of course guarantee usefulness.

Planning and Executing Evaluations

We recommend that the Department direct principal evaluation unit
staff to meet regularly with relevant staff of committees to:

. negotiate agreement about when particular evaluations are
  warranted and the senses in which each evaluation required
  by law is possible,

. clarify Congressional information needs, quality of evidence
  required, and planning cycle for each major evaluation
  undertaken by the Department,

. identify specific audiences or groups for evaluation results.

. identify the changes in program or understanding which could
  occur on the basis of evaluation results.

The rationale for this recommendation is identical to the one offered
for a similar recommendation made to Congress. Understanding
Congressional information needs is not possible without some regular
discussion between technical evaluation staff and Congressional staff.
Scarcity of evaluation resources requires better planning and that
planning cannot be informed without dialogue among relevant staff.
Tests of New Program Components, Program Variations, and New Programs

We recommend that the Department authorize explicitly the use of high
quality evaluation designs, especially randomized experiments, in
[...] components, new program variations, and new [...] estimating the
effects of innovative [...].

[...] high quality designs lead to far less debatable [...] children
than low quality designs. They are more difficult to execute, and they
are more feasible for pilot testing [...] program components, than for
estimating the effects [...] ongoing programs. Explicit authorization
would make the importance of good designs plain, and would provide
more clear opportunity for competent SEAs and LEAs to exploit them.

Critique and Reanalysis of Evaluation

We recommend that the Department:

. provide for the independent, balanced, and competent
  critique of every major evaluation funded by the
  Department in procurement of evaluations and
  evaluation policy,

. incorporate into procurement procedures and policy
  the requirement that all statistical data produced
  in major program evaluations be documented and stored
  for reanalysis.

. create an administrative mechanism for deciding when
  simultaneous analysis by both the original evaluator
  and an independent analyst is desirable and feasible
  and a mechanism for executing simultaneous independent
  analyses.

The rationale for this recommendation is identical to the one offered
for a similar recommendation to Congress.

Access to and Specification of Reports 

We recommend that the Department adopt a policy to:

. adhere to a clearance rule which makes evaluation reports
  available after a specified period of time.

. specify completely the evaluation documents referred to
  in the Department's Annual Evaluation Report, the Federal
  Register, and policy statements.
. include, in every major evaluation report, a list [...]

[...] clearance process. We also found it difficult to identify
reports precisely, when they were cited as evidence of the [...]
developing regulations or policy. The absence [...] recipients of
reports makes it very difficult to identify [...]. [...] consequence
is that what is useless or useful is less verifiable.

The Use of Evaluation Results



[...] unit staff to:

. provide oral reports regularly as well as written reports
  on results of major evaluations, and on the uses to which
  results can be put, to relevant Congressional staff and
  support agency staff and the program staff within the
  Department,

. create a system to periodically collect, synthesize,
  and report specific uses to which evaluations are put.

. improve the Annual Evaluation Report by citing instances of
  use more specifically.




[...] finding that use of evaluation results is not tracked
conscientiously and the belief that [...] to be tracked to learn how
to do evaluations better, and how to better allocate evaluation
resources. The rationale for the last [...]

Implementation

We recommend that the Department:

. routinely require formal measurement of the degree to
  which program plans match actual operations,

. adjoin research on methods of measuring implementation
  to the introduction of new programs and program variations.

. [...] central information system on the time and resources
  required for full implementation of new [...]
The main reason for the first recommendation is simply that
measurement of implementation of innovations is infrequent. The reason
for the second recommendation is that we know little about cheap
effective methods of measurement in this arena. The third
recommendation stems from the absence of any reasonable empirical
guidelines on the time and resources necessary to implement innovative
programs.
1. INTRODUCTION



1.1 THE PURPOSE OF THIS REPORT

This report concerns evaluation of federally supported educational
programs at the national, state, and local levels. It was undertaken in
response to Section 1526 of the Education Amendments of 1978 (Public Law
95-561), which requires that the Commissioner of Education conduct a
comprehensive study of evaluation practices and procedures. The
questions covered here are those implied by the law and the conference
reports preceding it, and those enumerated in the Work Statement for
this project:

. Why and how are evaluations carried out?

. What are the capabilities of those who carry out evaluations?

. How are the results of evaluation used?

. What recommendations can be made to improve procedure or practice?

We discussed the questions with Congressional staff and federal agency
personnel to clarify them. The more detailed questions are elaborated in
the body of this report.

The Project on Evaluation of Evaluations is prospective in its
orientation, designed to provide evidence and argument bearing on these
questions and to provide recommendations which will help to meliorate
the problems we have identified. Pertinent excerpts from the law,
conference report, and Work Statement for the Project are given in
Appendix 1 to this report.

1.2 PROJECT STAFF: QUALIFICATIONS AND INDEPENDENCE

Northwestern University was given responsibility for conducting this
study. Staff members for the Project were drawn primarily from the
University, and their efforts were supplemented by consultants from
universities, education agencies at the local and state levels, and
private institutions. The staff members were selected so as to include
individuals with expertise in evaluation methods and policy, education,
law, management, and psychology. A description of staff members,
consultants and their roles, is given in the Appendix of this report.

Neither of the principal investigators in this study, Boruch and
Cordray, bid on competitive evaluation contracts awarded by the U.S.
Office of Education, the National Institute of Education, or any other
agency. This appraisal then is administratively and fiscally
independent of school districts, states, the federal government,
private vendors, and other regular contractors. We are not independent
of federal agencies in the sense of having received grants from the
National Science Foundation and NIE for research, and having
participated in advisory boards to other agencies with and without
payment. To assure that other independent appraisal of evidence in this
report is possible, we furnish references to published and unpublished
work. Unpublished material which does not abridge privacy of
individuals is stored at Northwestern University and will be made
available to other analysts so long as resources permit.

1.3 THE RESEARCH STRATEGY, SOURCES OF EVIDENCE, AND THE ORIENTATION OF
THE PROJECT

To answer the questions posed earlier, the project staff relied on
two broad sources of information: contemporary research and development
by other researchers, and direct investigations by the project staff.
Round-table discussions were organized to enlarge on information
obtained from these sources. In considering contemporary research we
reviewed over 400 recent published and unpublished reports on
functions, conduct, and staff of evaluations at local, state, and
federal levels, and pertinent testimony to Congress. Raw data were
acquired and reanalyzed when this was warranted and time was sufficient
to assay the data's quality. This review of others' work was
supplemented by interviews with the individuals responsible for the
production of some of the reports, such as government staff and
contractors. The results of the review are employed throughout the
report. A consolidated review of the work is presented in the Appendix
and our review of evaluation policy has been published in Review of
Research in Education.

The second major resource was information collected by the Project
staff directly from individuals on site, in telephone interviews, and
through round-table discussions. The purposes of this exercise were to
assure understanding of earlier work and to fill in gaps in what was
known about evaluations when the Project began. Such interviews were
essential in corroborating evidence about the use of evaluations,
covered in Chapter 6, for example. At the federal level, the interviews
focused on staff of the Office of Evaluation and Dissemination at the
U.S. Office of Education. Interviews were also conducted with members
of the National Institute of Education, pertinent Congressional staff,
the General Accounting Office, the Congressional Budget Office and the
Congressional Research Service. The main criterion for selecting
respondents in each of these cases was their knowledgeability about
educational evaluation.

At the state level, site visits were made to six state education
agencies: Minnesota, Michigan, California, Texas, New Jersey and
Massachusetts. They were selected randomly from a stratified list of
sites. Florida was chosen purposively for a site visit on account of
Florida's use of particular evaluation approaches to Title I supported
compensatory education programs. Columbus, Ohio, served as a pilot test
site. Approximately 50 telephone interviews were undertaken in the
remaining states to supplement site visits with statistical
characterization.

At the local level, twelve school districts were selected randomly
for site visits and intensive case study from a stratified list. Three
districts that declined were replaced by districts selected randomly
from a matching list. The resulting sample of cases includes:
Jersey City, New Jersey          Springfield, Massachusetts
San Diego, California            Lansing, Michigan
Grille, Texas                    Colorado Springs, Colorado
Gaston County, North Carolina    Kanawha County, West Virginia
Broward County, Florida          St. Paul, Minnesota
St. Louis, Missouri              Jefferson Parish, Louisiana

[...] Providence, Rhode Island, Marion County and Osceola County,
Florida [...] more sophisticated approaches to evaluating Title I
programs. The individuals to whom we spoke in each [...] members, and
evaluators. The case studies were supplemented by over 200 [...] having
evaluation responsibility at the local level. The sample was selected
randomly from a stratified list of LEAs. The design is described in
Appendix 3 of this report.
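
The site-selection procedure described above (random draws within
strata, with decliners replaced from a matching list) can be sketched as
follows. This is an illustration only, not the Project's actual
procedure, which is documented in Appendix 3: the stratum labels, site
names, sample sizes, and function names are all hypothetical, and the
sketch assumes the matching list holds enough sites in each stratum to
cover every decliner.

```python
import random


def stratified_sample(frame, per_stratum, seed=0):
    """Draw `per_stratum` sites at random from each stratum.

    `frame` maps a stratum label to the list of sites in that stratum.
    """
    rng = random.Random(seed)
    return {
        stratum: rng.sample(sites, per_stratum)
        for stratum, sites in sorted(frame.items())
    }


def replace_decliners(sample, matching_list, declined, seed=1):
    """Swap each declining site for a random site from the same
    stratum of the matching list; other sites are left untouched."""
    rng = random.Random(seed)
    final = {}
    for stratum, chosen in sample.items():
        # Candidate replacements: matched sites not already selected.
        pool = [s for s in matching_list[stratum] if s not in chosen]
        rng.shuffle(pool)
        final[stratum] = [
            pool.pop() if site in declined else site for site in chosen
        ]
    return final


# Hypothetical sampling frame and matching list (invented names).
frame = {
    "large": ["L1", "L2", "L3", "L4"],
    "small": ["S1", "S2", "S3", "S4"],
}
matching_list = {
    "large": ["L5", "L6"],
    "small": ["S5", "S6"],
}

sample = stratified_sample(frame, per_stratum=2)
declined = {sample["large"][0]}  # suppose one large district declines
final = replace_decliners(sample, matching_list, declined)
```

Drawing replacements from the same stratum keeps the final sample
balanced across strata in the same way as the original draw.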

Private and public organizations serve as contractors for federally
[...] research on evaluation [...] supplying information, corroborative
evidence, or reactions to the questions we were asked to address were
staff members [...] Associates, Bay Area Research, NTS Research [...]
Research, Systems Development Corporation, [...] Educational Testing
Service, state departments of education, local research and evaluation
units, and others mentioned in the [...]

Round-table discussions were organized to consider particular topics
[...] on the basis of their expertise. The topics included "School
Boards and Evaluation," "Evaluator Capabilities," "Utiliz[...]" [...]
Education." Participants included representatives [...] (Illinois),
Mesa (Arizona), Austin (Texas), states such as Minnesota and
California, research organizations [...] independent university
researchers. Round-table participants were treated as consultants.

[...] presenting more than fragments of our report in professional
forums. We have been able to capitalize, however, [...] two national
professional organizations, at meetings of the National Academy of
Sciences [...] professional meetings on evaluation organized by the
Department of Justice and the National Institute of Education.

The Project formally began September 29, 1979, and submission of a
final report was scheduled for June 30, 1980. Because of this tight
schedule we focused our attention primarily on evaluations in four
program areas: compensatory education supported under Title I,
education for the handicapped, [...] and bilingual education. These
served only as general targets and we capitalized on some work in other
areas, notably career education, day care education, innovative
projects, and higher education. Our early discussions with federal
agency staff and Congressional staff led to an agreement that our
findings and recommendations should, however, be directed to evaluation
generally rather than evaluation in specific program areas. We have
abided by that agreement generally in this report.

The general orientation of the Project is prospective, as agreed during
meetings with Representative Holtzman's staff. This study has not been
dedicated to identifying evaluations or evaluation staff that are
particularly inept. Rather, the aim has been to develop useful general
statements and to make recommendations based on available evidence,
rather than to publicly criticize particular individuals. Some
recommendations are, however, dedicated to reducing the likelihood of
incompetence and to facilitating integrity. The information we have
collected on individuals is understood to be confidential and for
research purposes alone, helpful in making decisions about quality and
use of evaluation, and not for making decisions about specific
individuals.

A detailed description of surveys, case study selection, methods,
round-table participants, and other technical details on information
sources is given in Appendix 3.

1.4 ORGANIZATION OF THIS REPORT

The report is organized around the major questions we were asked to
address. Chapter 2 considers the rationale, evidence, and opinion bearing
on why evaluations are done, the confusion and argument engendered by
general demands for evaluation, and the audiences to whom evaluations are
addressed. Chapter 3 addresses the question of how evaluations are
executed. Chapter 4 covers the organization of evaluations and the
capabilities of evaluators, and Chapter 5 considers quality of
evaluations. The way evaluation results are used is considered in
Chapter 6. The chapter includes case studies on the use of evaluative
information. Chapter 7 covers recommendations.

1.5 PRELIMINARY DEFINITIONS

Section 1526 of Public Law 95-561 refers to "evaluations at federal,
state, and local levels." To avoid confusion, we exploit existing federal
agency guidelines and recognize that there is no universally accepted
definition of evaluation. Consequently, we adopt a working definition and
catalog a set of questions often addressed in evaluation. Evaluation is
defined here tentatively as a study designed to answer questions about
what a program does in the interest of making judgments about the
program. The questions often addressed include: Who is served? What
services are delivered? At what cost? With what effect? A more elaborate
description is given in Chapter 2, along with a discussion of the diverse
meanings attached to
the word by academicians, legislators, program managers, and others.

Local education agency (LEA) is defined as a school district or
organization ... within the school district. State Education Agency (SEA)
... a state organization responsible for administering ... educational
program at the primary or secondary school level. Local and state ...
refer to activities at each level. References to federal ... agency is
pertinent. Contractor refers to an ... organization funded to design or
conduct an evaluation ... provide technical assistance through training,
analysis, and ...

The ... Statement for this Project refers to proposals and
recommendations on modification of evaluation practices and procedures.
We define this further to mean suggestions of the following kinds:

• clarification of language on practices and procedures,

• enumeration of options on practices and procedures,

• specification of principles which should underlie practice,

• specification of direct implications of contemporary evaluations
  for practice and procedures,

• establishing priorities,

• specification of action or decision.

1.6 AGENCIES WITH RESPONSIBILITY FOR SUPPORTING AND CONDUCTING EVALUATIONS
AND DEVELOPING EVALUATION METHODS

The Office of Education's Office of Evaluation and Dissemination (OED)
has responsibility for evaluation of OE's education programs. The main
focus of this Project has been on OED activity. Other agencies, such as
the National Institute of Education, do undertake evaluative work at
times. Agencies such as the General Accounting Office have had
responsibility for activities which can be properly labelled as
evaluation as well as for overseeing evaluations undertaken by other
federal agencies. The following remarks briefly describe the pertinent
agencies and the scope ... of recent changes engendered by creation of
the new Department of Education. An organization chart for the Department
is attached.

Over the past five years, the Office of Evaluation and Dissemination
has been responsible for evaluating programs administered by the U.S.
Office of Education and for coordinating dissemination of exemplary
materials to state ... agencies. The major routine exceptions to the
evaluation mission have been the Bureau of Education for the Handicapped,
... Financial Aid, and the Follow Through Program, each ... evaluation
funds. The Office's predecessor, created in ..., has undergone changes in
title, such as the Office of ..., ... responsibilities. As of March ...,
... professional staff members, OED has been responsible for
administration of an average of $21 million annually in evaluation funds
since 1975. Its evaluation responsibilities have included deciding
whether to evaluate, designing evaluations, issuing evaluation contracts,
and monitoring, reviewing, and criticizing evaluations which it supports.
OED routinely prepares summaries of evaluation for dissemination and has
routinely distributed both summaries and reports. The evaluation unit of
OED has been divided into three units, each being responsible for a
specific education sector: Elementary and Secondary Education;
Occupational, Handicapped, and Development Programs; and Post-secondary
Education. A fourth unit, the Division of Educational Replication, was
responsible for dissemination policy and for disseminating educational
products approved by the Joint Dissemination Review Panel (JDRP). JDRP
has included staff members of the Office of Education and the National
Institute of Education. OED released its ninth Annual Evaluation Report
covering fiscal year 1979.

Under the new Department of Education, the evaluation staff of OED
were transferred to a new unit within the Department, the Office of the
Deputy Assistant Secretary for Evaluation and Program Management, headed
by John Seal. Program Evaluation is one of three divisions within the
unit. The remaining two are the Division of Management Evaluation and the
Division of Management Planning and Assistance. The Evaluation Division,
according to internal policy statements, has the responsibility to
conduct impact evaluations and formative or process evaluations, and to
assess alternative program strategies and structures. The Division will
continue to prepare Requests for Proposals and review and approve
evaluation contract proposals. The Dissemination staff of OED have been
transferred to the Office of Dissemination and Professional Improvement
in the new Office of the Assistant Secretary for Educational Research and
Improvement.

The Education Amendments of 1972 (Public Law 92-318) established the
National Institute of Education as part of the Education Division of the
Department of Health, Education, and Welfare. NIE is charged under its
enabling statute with building an effective research and development
system in the interest of improving American education. This includes
administrative and fiscal responsibility for educational laboratories and
centers, such as UCLA's Center for the Study of Evaluation, and for the
National Assessment of Educational Progress run by the Education
Commission of the States. The NIE acts as a foundation in making grants
to independent researchers in universities, colleges, educational
organizations, and state and local educational research units, and it
awards contracts for larger scale, applied research on special research
topics.

Evaluation-related work includes development and testing of new methods
of evaluation, new solutions to problems of data access and
dissemination, new methods of testing and observation, and development of
guidelines on evaluation for use by local and state education agencies.
The methodological work is administered by NIE's Testing, Assessment, and
Evaluation Division and is carried out by the Northwest Regional
Education Laboratory, for instance, as well as by independent
researchers. NIE has no routine responsibility for evaluation of programs
administered by the U.S. Office of Education. But evaluation of smaller
new pilot programs whose development is supported by NIE falls within its
research mission. NIE has at times been given a special Congressional
mandate to evaluate, e.g., the Compensatory Education Study of Title I
supported programs and the current Study of Vocational Education. And it
has been directed occasionally by the Secretary of DHEW to undertake
special evaluations. Current efforts include assessments of the Reverend
Jesse Jackson's Push for Excellence ... program. In the new Department of
Education, NIE is a part of the Office of the Assistant Secretary for
Educational Research and Improvement.

The Bureau of Education for the Handicapped has had a routine
responsibility for small scale evaluations of projects for the
handicapped as part of its development mission. The vehicle for this
activity is BEH's Division of ... Development. A unique major evaluation
activity during ... developing evaluation plans for Public Law 94-142, a
statute mandating free and appropriate public education for all
handicapped children. ... planning and execution has been lodged with ...
Program Implementation Studies Branch. The evaluation focused on
questions similar to those typically addressed in studies run by other
...: ... services? To what extent is the intent of the Act met? And so
on.

The Office of the Assistant Secretary for Planning and Evaluation in
DHEW's Education Division has had responsibility for review and synthesis
of evaluations executed by other agencies, and for use of evaluations in
planning. The number of actual evaluations carried out since 1978 has
been small; most ... on planning. The Office of the Assistant Secretary
... DHEW's Evaluation and Technical Analysis Division has responsibility
for small evaluability studies, examining the extent to which program
objectives are measurable.

The National Center for Educational Statistics is mandated under the
law to "collect and disseminate statistics and other data related to
education in the United States and in other nations." NCES's activity is
confined to descriptive surveys rather than to evaluations, but the
information generated is material to design and execution of evaluations.
That information includes, for instance, listings of school districts and
other ... characteristics, and other information which is helpful in
designing special purpose evaluations. More generally, the descriptive
statistics generated in surveys, on expenditures, pupil ..., enrollments,
and the like, normally serve as backdrop for educational research and
evaluations. Some data collection efforts, notably ... of the High School
Class of 1972 and of ... class of 1980, serve as the empirical basis for
theoretical analyses of the process and outcomes of education. This
includes efforts by academic ... to estimate the effects of federal
programs based on the survey data rather than controlled field
evaluations.

Under the Congressional Budget and Impoundment Control Act of 1974
(Public Law 93-344), the Comptroller General is required to "review and
evaluate the results of Government programs and activities . . . when
ordered by either House of Congress, or upon his own initiative, or when
requested by any committee . . . or joint committee . . . having
jurisdiction." In implementing the law, the General Accounting Office
created a new Program Analysis Division (PAD) to develop perspective,
policy, and guidelines on evaluation and other approaches to
understanding performance of federal programs. The Human Resources
Division has had major though not exclusive responsibility for
assessments of educational programs. In 1976, for instance, it reported
on problems and needed improvements in evaluating education programs.
Current efforts include developing a report on the quality of
evaluations. The evaluation-related activities of PAD cover a wide range,
including development of status and issues papers on the topic,
development of guidelines for assessing quality of evaluation reports,
and guidelines and policy for reconciling privacy problems engendered by
federally supported social research in general. In 1980, a new Institute
for Program Evaluation was set up within GAO, partly to consolidate
initiatives which cut across divisions such as PAD and Finance and
Management Sciences. The 1974 mandate and subsequent activity reflect a
notable departure from earlier roles of the GAO, going well beyond the
management and accounting emphasis of the 1950's and bringing a wider
variety of skills and interests into the organization.

The National Science Foundation's primary mission is support of
scientific research. NSF has been involved in the development of science
education programs and to a limited extent in their evaluation. The
agency has also supported basic and applied research on methods which
ultimately find their way into evaluations in education, health,
economics, criminal justice and law enforcement, and other areas. This
includes, for instance, support of the development of a state of the art
work on the use of formal field experiments to plan and evaluate social
programs, and on solutions to managerial, scientific, and political
institutional problems engendered by such field tests.



State and Local Education Agencies with Responsibility for Supporting or
Conducting Evaluations

At the state level, organizations responsible for evaluation differ
from one education agency to the next, and from state to state. No single
organizational entity is responsible for evaluation. Within a school
district, responsibility for evaluation may be organized along program
lines, a Title I program manager being responsible for evaluation, for
example. Or, the responsibility may be vested in a research and
evaluation unit. The responsibilities imposed by federal law are
discussed generally in Chapter 2. Organizational arrangements are
considered in the chapter on manpower capabilities.

1.7 ELEMENTS OF AN EVALUATION

The word evaluation implies different things to different people. To
avoid some chronic misunderstandings, we enumerate the elements here
briefly. The elements are, in principle, desirable, judging from
guidelines issued by professional organizations, by Congressional support
agencies such as the General Accounting Office, and by federal evaluation
agencies. They are not always a matter of practice.

• Deciding to evaluate and choosing the questions to be
  addressed by evaluation

• Designing the evaluation, including sample design

• Contracting for the evaluation

• Conducting the evaluation and pertinent side studies

• Analyzing results, making recommendations, and reporting

• Using the results

The elements are discussed in Chapter 3, on how evaluations are conducted.

CHAPTER 2. WHY ARE EVALUATIONS UNDERTAKEN?

Robert F. Boruch, David S. Cordray, Joe S. Cecil, and Laura Leviton



Program Evaluator: Democritus said that he would
rather discover a single causal connection than
sit on the throne of Persia.

Program Manager: Some of us would rather sit on
the throne of Persia.

2. WHY ARE EVALUATIONS UNDERTAKEN?
TO WHOM ARE RESULTS ADDRESSED?



This chapter discusses why evaluations are carried out and identifies
some of the audiences for evaluation results. The most direct reasons for
evaluation are considered in the next section with short illustrations
from local, state, and federal experience. The justification for
evaluating is complicated by ambiguity in the use of the word evaluation,
and we also consider this topic. The audiences for evaluation, the groups
for whom evaluations are produced and those exhibiting interest in
evaluation results, are discussed in Section 2.2. We present information
on statutory mandates for evaluation in Section 2.3 because the laws are
an immediate justification for evaluation, but certainly not the only
one, at all levels of government. Section 2.4 presents statistical
information about the questions addressed by local, state, and federal
level evaluations. Section 2.5 reviews issues in the decision to
evaluate.



2.1 THE QUESTIONS ADDRESSED BY EVALUATION

In principle, most evaluations are carried out to answer one or more
of the following questions:

    Who is served by the program? Who needs services?
    What are the services, how well are they delivered, and
        what do they cost?
    What are the effects of services on recipients?
    What are costs and benefits of alternatives?

Moreover, the information is obtained to facilitate making judgments or
decisions about some aspect of the program. The audiences for the
information depend partly on which questions are answered and may include
policy makers, managers, and oversight groups.

This description is, of course, deceptively simple. Matters become
complicated quickly once the decision to evaluate is made. So, for
instance, the Congressionally mandated Compensatory Education Study of
Title I programs focused on only one audience, Congress, deciding that
earlier efforts to accommodate all possible audiences, such as interest
groups and federal program managers, were inappropriate. That decision
engendered problems in dealing with audiences whose interests were not
addressed directly. The Study was remarkable in being asked to address
questions about fundamental purposes of the program, along with more
general questions of the sort enumerated earlier. New questions were
adjoined to the ones specified originally by Congress so as to satisfy
special concerns. The questions themselves had to be translated into
rather more specific form to be useful. The translation resulted in the
Study's focusing on four broad topics: allocation and distribution of
funds, relations among government agencies in regulating and managing
program activities, delivery of services, and changes in abilities of
students participating in the program. The equal priority attached to
each of these themes ran counter to then prevailing opinions, which
attached highest priority to estimating effects on children. The latter
was found not consistent with the Congress's view that other aspects of
the program were equally important. The process of negotiation and
approval on questions to be addressed, on the strategy to be used in
answering them, and on anticipating how the information would be used
took over six months.

There are, of course, a variety of reasons why evaluative questions
are asked in the first instance. We cannot deal with all of them here,
but three of the most salient are worth mentioning. The fundamental one
concerns children. Programs that are evaluated are dedicated to aspects
of children's well-being which can be influenced by schools. To the
extent that evaluation helps one understand whether a child is served,
how well he or she is served, and whether beneficial effects are
detectable, then evaluation is part of the program and seeks the same
ends. The second reason is simple accountability. To the extent that
programs are directed toward special needs and they are expensive in the
short run, it makes sense to establish that they are not diverted in
unproductive ways and to determine if they warrant improvement. The third
reason bears on compromise. This includes, for example, suggesting that
pilot tests of a program, an evaluation of a special kind, be undertaken
before a massive new program is mounted, in the face of vigorous but
insufficiently informed enthusiasm. The issue of the new program then is
shelved until more information is obtained.



National Level: Illustrations

The evaluation strategy of the U.S. Office of Education has been
pertinent to the questions enumerated above. Elaborated in USOE's
Annual Evaluation Report, the strategy includes clarifying feasible goals
for programs, identifying modifications in program content or
administration to improve programs, and identifying especially effective
projects. These core questions are also explicit in the GAO's guidelines
on impact evaluation. Judging from interviews with CBO staff, the
questions are typically those addressed by staff members in education as
well.

Answers to questions about who received compensatory education
services under Title I, for example, have been provided most recently by
the Sustaining Effects Study and by the National Institute of Education
Compensatory Education Study. That head counts such as these are
deceptively simple is apparent from, for example, the discrepancy between
estimates yielded by state reports of students served (5 million) and
estimates yielded by the NIE Compensatory Education Survey (6 million).
Determining who receives and who might receive services under Public
Law 94-142 for the handicapped was undertaken by the Bureau of Education
for the Handicapped. This effort was remarkable in finding a major
discrepancy between estimates of the number of handicapped children
provided in the law, eight million, according to Section 601 of U.S.C.
1401, and estimates based on field research, 4 million in 1977-78. This,
of course, has major implications for appropriations.

Questions about who receives Title I federal monies for services,
at what level, and of what kind have been addressed in the early reports
of the Sustaining Effects Study supported by OED. The work focused on
alternate ways of defining children's eligibility and institutional
eligibility, and the probable effect of changes in rules. The actual
operation of certain programs was the focus of Rand's study of programs
supported under Title IV of the Civil Rights Act and dedicated partly
to resolving education problems engendered by school desegregation.

Efforts to estimate the effect of programs on children are not
undertaken often, partly on account of the resources required to mount
high quality tests. Among early efforts, we include controlled field
experiments such as the one conducted by the National Opinion Research
Center and dedicated to estimating the relative effects of Emergency
School Assistance Act programs in facilitating education in schools
undergoing desegregation. The more recent outcome evaluations include
efforts to estimate relative effects of compensatory education programs
such as Follow Through, of bilingual education, and of Title I. The
analysis of the Follow Through evaluation, and the ensuing debate over
what one could infer from it, illustrate the difficulty of such outcome
evaluations. The matter is considered later in this report.



State Level: Illustrations

At the state level, answers to the question "Why evaluate?" are
limited by the fact that many states take as their responsibility
technical assistance and coordination rather than actual evaluation. The
immediate reason for any involvement is federal law. But some states
have taken a strong initiative to develop sophisticated approaches that
are consistent with federal demands and exceed them. One of the typical
incentives for this is that some states have their own programs running
in parallel to federally supported programs. State education agencies are
normally responsible under the law for disbursement of funds provided
under Title I and for reporting and periodic monitoring of local programs
supported by Title I. The reporting on "outcomes" often consists of
consolidation of achievement test results supplied by local education
agencies. Where the states have a reporting role in compensatory
education, in vocational education, and others, reporting focuses on
consolidating information about who is served, the nature of services,
and efficiency in delivery.

Some states exceed federal requirements in having developed remarkable
evaluation reporting systems. Minnesota, for instance, requires
specification of objectives and evaluation plans for education programs,
and Minnesota law on educational planning and evaluation is distinctive.
Massachusetts law requires that local education agencies specify clearly
how Title I programs are modified on the basis of each evaluation.
California incorporates requirements for evaluation into its
Comprehensive School Improvement Program (Assembly Bill 65, Chapter 894,
1977), the state's Master Plan for Special Education (Assembly Bill 1250,
Chapter 1247, 1977), and others.



Local Level: Illustrations

From our site visits to local education agencies, we judge that
government requirements are an immediate reason for evaluation of all
programs receiving federal support. But this does not mean that all
LEAs merely comply with state or federal requirements. There is also
considerable variation across LEAs and across programs within the LEA.

Variation across Programs. To be specific, we asked our respondents
directly why evaluations were undertaken and the priority they attached
to federal and local demands. For Title I programs, about half the sites
we visited put local interests in evaluation far above government
requirements. The remaining half do evaluations primarily because they
are required. There may be argument about what it means to "find out how
well the program is working . . ." in the more active sites, in Title I
and other programs. But that debate does help to illuminate the
evaluation and provoke interest. For bilingual education, interest in
using evaluations to modify programs takes a slightly higher priority
than federal requirements in our site visits. In vocational education,
the majority of sites "evaluate" mainly to meet federal requirements. The
importance of those requirements is overshadowed by local interest in
evaluation in a minority of cases. Innovative projects supported under
Title IV-C include vocational education, and the interest in evaluation
apart from meeting requirements is clear. In programmatic education for
the handicapped, the stress is on meeting government requirements. It
is a relatively new program, the idea of evaluations apart from
compliance is not well developed, and sites engaged in little or no
systematic evaluation beyond this.

Variation across Sites. There is considerable legitimate interest in
finding out whether a program is working in the active sites and
programs. To be sure, there may be argument about what "working" means.
Some regard it as an easy question and simply obtain counts of those
who are served. Others try to estimate unambiguously the effect of
services. But the interest is explicit at local and state levels which
go beyond government requirements.

To illustrate, Site A's reports on evaluation of an Emergency School
Assistance Act program went considerably beyond requirements in
attempting to estimate the effect of the program on children, teachers,
and parents. Site D regularly augments the data collected to meet Title I
requirements and uses the augmented information to assure that program
objectives are appropriate rather than gratuitous and to set annual
student achievement goals. Questions bearing on the effect of programs
are not accorded high priority because staff believe it's not possible to
estimate the effect independent of other services to children, in the
case of Title I, or because the program has already "demonstrated its
effectiveness." Site E's commitment to an office of research and
evaluation stemmed from early federal mandates to evaluate. The district
has increased the office's evaluation budget regularly, and this appears
to be a result of energetic production of balanced reports and an
interested, responsive school board and administration.

Not part of the site visit sample, but no less interesting, are LEAs
whose staff members have published accounts of their activity. The
Detroit Public School District, for instance, has cooperated with Wayne
State University staff to mount well designed randomized field
experiments to determine the effects of a preventive mental health
program for children at risk and enrolled in Title I programs. McDuffie
County, Georgia's LEA has had sufficient interest and resources to
develop a Title I program and an evaluation which are sufficiently sturdy
to pass muster with the Joint Dissemination Review Panel, and the program
is now being made available through the National Diffusion Network. About
45 Title I programs have been recognized by JDRP. Providence, R.I., is
remarkable for attending to a wide range of problems encountered by
Title I programs and being able to document evaluations of tests, program
goals, program components, staff concerns, parental involvement, and
other matters.

On the other hand, Site K generally does "evaluation" only because
it's required, for example, and this amounts to no more than simple
reporting of test scores. In Site J, the evaluation of the special
education program for the handicapped amounts to no more than setting an
objective, such as "getting a typewriter" and achieving the objective
"got the typewriter." Neither the school board nor administration appear
to be much interested in evaluation in Sites A, C, H, J, and K, and that
disinterest is reflected at the program level. Those sites which stress
meeting federal requirements are not all uninterested in evaluation. Some
respondents told us they would like to do more, but local indifference or
lack of time and money prevent doing more. Some respondents clearly saw
no point to doing anything more than the government required.

Reasons for the Questions. The immediate reason for addressing the
questions at the local level is, as we've said, federal requirements. But
other reasons are as numerous as those at any other level of government.
Where school boards are active and interested in evaluation, evaluators
are accountable to their members. Where superintendents are vigorous in
supporting an evaluation unit, their interest stems partly from routine
management information which is essential in operations and some of the
information needed to modify operations. The benefits of evaluation to
children emerged indirectly in conversations, partly because there is
some reluctance among evaluators to announce that ultimately evaluation
is "for the children." The phrase is used hypocritically as well as
honestly in the field, and the chronic hypocritical uses make honest uses
something of an embarrassment. To illustrate the benefits, consider
Site E's recent tests, designed to understand what strategies facilitate
children's reading. An expensive, commercially advertised regimen was
compared against an in-school remedial teaching program and against one
other approach to assay how children performed. The results demonstrated
that children learned more under the regimen only when it was exceedingly
well implemented. Since high implementation was difficult, and since the
alternatives did not fare well relative to the new regimen, the results
were used to argue that districts should be allowed to choose between the
remaining ones.

Apart from interest in answering questions, a common reason for
evaluation is verifying to others the worth of a program. The Title I
director in one site, for instance, stressed the fact that he wanted to
assure that if he did a good job, there could be an independent appraisal
of the effort. His interest lies in building a small pocket of credibility
in a city which has a fairly strong tradition of corruption. We encountered
sufficient recognition of the benefit of independent verification at other
sites to believe that this reason is common and neither demeaning nor
dishonest.

We found no uniform attention across programs in estimating the effects
of programs on children. Indeed, there was informed skepticism in Sites A,
D, J, and others that estimates of the effects of Title I services could
be made sensibly, even with federal "models" of how to do it, because other
services were provided with Title I services simultaneously. In other
cases, the task of estimation is easier simply because there are no other
special services. The skepticism is reflected at the federal level as
well, judging from professional papers published by staff members of ASPE,
academic critics, and others. We have not seen the same skepticism
registered by LEA or SEA officials in testimony to the Congress. The
difficulty of estimating effects generally has not prevented small scale
tests from being mounted where the opportunity arises. In Site , controlled
tests of a vocational education program have been nicely designed
and executed using Title IV-C funds for innovative projects. Recall the
Detroit and McDuffie County illustrations given earlier. The remarkable
efforts are in a minority to be sure, but they are no less important for
that.

Other reasons for evaluation are somewhat less admirable. Webster
and Stufflebeam's study of 35 very large school districts, for example,
yielded one clear illustration of a principal asking for a survey which
would make the district look good. The request appears to have been born
of desperation since he was subsequently fired. We were told in an
interview with one site that the administrator occasionally asks the
evaluation unit to investigate some problem for which a decision has almost
been made and the evaluation unit's task is merely to collect evidence to
prove the case.

The interest in "puff pieces" or "hatchet jobs" and their frequency,
however, appears to be rather low from our site visits and other
information. It is, we believe, more likely in locally generated evaluations
of personnel, for example, than in evaluation of federally supported
programs. But evidence on any of this is weak.
[illegible] no problems in assuring the integrity
of evaluation, or in assuring candor in reporting.
The matter is discussed in Chapter 5 on how well evaluations are conducted.

Other Ways of Characterizing the Functions of Evaluation

The [illegible] enumerated earlier are a reasonable framework for
understanding why evaluation might be mounted. There are other reasonable
classification schemes, and other vernacular is often used to describe
[illegible] terms to describe what an evaluation is supposed to
accomplish. These are considered in the following remarks because they are
[illegible] evaluation community and occasionally in law and regulation.

Needs assessment surveys are descriptive studies undertaken prior to
[illegible] a program to establish what needs are, their
scope, [illegible] particular target groups. The work may
involve [illegible] in the simplest case. The
more elaborate may include formal statistical surveys as well as case
studies. Illustrations from contracts completed under support of OE's
Office of Evaluation and Dissemination during 1976-78 include "Assessment
of Available Resources of Services to Severely Handicapped Children."
Such surveys cannot normally be used to estimate the effects of programs
in the least equivocal way possible.

Process evaluations include studies of the activities, operations,
organization, and other aspects of a program. If the program is new, the
examination is often labelled formative evaluation and implies
trouble-shooting activity. Illustrative projects supported by the OED
include [illegible] operations. Less elaborate
[illegible] of evaluation may involve short site visits and
managerial case study. The more elaborate approaches can involve
conscientious measurement of how many individuals receive services or how
many institutions comply with instruction, of the degree and type of
[illegible] of activity is understanding
delivery of the program of services, and adherence to standards, rules, or
instruction, rather than understanding or estimating effects of service.
The idea of process evaluations is ambiguous in the sense that it can be
regarded in most routine form as administrative monitoring through
conventional record systems.

Outcome evaluations attempt to estimate in the least equivocal way
possible the direct effects of a program on its main target group, usually
children. Where there are multiple target groups and many effects are
indirect, the activity may be labelled summative evaluation or impact
evaluation. Where both costs and effects of program variations are
estimated, [illegible] analysis. Illustrations of
the type include NIE supported field experiments on career education
programs and OED supported field experiments on some of the programs
mounted under the Emergency School Assistance Act.

Administrative audits are often undertaken by state offices to
verify that programs said to be operating at the school district level
are indeed operating. In Florida and elsewhere, they are directed toward
examining administration and processes and may involve visits to
classrooms.

Administrative and technical support studies are often supported by
evaluation budgets. These include conferences, technical assistance
centers of the kind supported under Title I, special planning studies, and
development of new methodology.

Ambiguity in What Is Meant by Evaluation 

The law rarely asks for specific types of information and never asks
that a specific question be answered. Furthermore, written and readily
available background information on the origins of the legislative demand,
of the sort which appears in Hearings, for example, is often terse.
Partly for these reasons, opinions often differ about what is intended,
what evaluative information is essential, and how such information will
be used. The differences have been registered in interviews with federal
agency officials and in their professional papers, and by at least some of
the Congressional staff to whom we spoke.

Not recognizing the variation in interpretations of what is meant
by evaluation is imprudent at best and dangerous at worst. The ambiguity
is complicated by different levels of expertise and by different views of
the topic within Congress and the agencies and across federal, state,
and local government. Consider the following illustrations.

(a) The Director of Research in an education division of the federal
government told us during an interview that his division did no
evaluations. This was despite a list of projects for FY 1977-79 which
included four items with the word evaluation in the title. He said they were
"not really evaluations." His superior, a deputy assistant secretary, who was
interviewed immediately afterward, said that "almost every project we do
is an evaluation."

(b) In an interview with a Congressional staffer, we were told that his
Committee was interested in effectiveness, that is, how many children
are in the federal program in primary school, junior high and high school.
In response to the question "what about effects on children" he said that
this is not what is normally meant, in his judgment, by effectiveness
among his Committee members even if it is viewed that way among staff and
members of other committees. This discussion is consistent with
difficulties encountered in the NIE Compensatory Education Study and the long
negotiation needed to settle on the major questions to be addressed. It
is consistent with the confusion and "protracted negotiation between USOE
and concerned Congressional committees" over models for evaluating Title
I programs.

(c) In a presentation before a National Academy of Sciences Committee, a
Congressional staff member criticized the studies supported by a federal
educational agency as being "silly evaluations." It was pointed out that
the studies were more properly characterized as basic research, silly or
not.

(d) [illegible] Syracuse Research Corporation of Title I supported
[illegible] respondents in a large sample said they
[illegible] students." Estimating the
effects of Title I programs, a function implied by early government
requirements for evaluation, was apparently of minor interest. [illegible]
students," or in special education programs
[illegible] children, or in bilingual education as evaluating materials,
was confirmed in our round-table discussions and some site visits.

(e) The bewildering array of senses in which the phrase "program
evaluation" is used in the professional literature forms the stuff of
[illegible] of the University of Colorado and Frederick [illegible].
[illegible] plough through, with alarming conscientiousness, evaluation
as "applied sciences," "systems management," "decision theory,"
"assessment [illegible]," "jurisprudence," "description or portrayal."

t^e buck'tf the fade??"*' ""^'^ Congress passes 

at leaS 4« = f ? ^8*°="^ ^° instruct the Congress about evaluation 
asked a CoLr^« instances. For example, one of our informants at NIE 
Uft It S tf th/^".^^ """"^ P"8ram and its evaluation. 

X. lorleaL^s ^ ^^^^ ^ ^^^^^^ ^ 
^^^^1^^^^^^'^^^ - P...leular1-Sn^^^^^^^^ 
(g) Ambiguity is not confined to education. The most recent UNESCO con-

o'organ^Ltloi^^^ ^^^^^ alf^ttention 

f ^ ! computer systems rather than to evaluation In the sense 

eft^™.^' f ^ ""^"^^ °P"«te as they should or In the sense of 

"^otavHeen-u ^5°"' ' definitions of evaluation 

numtLS projra^s ?"=Merlcan conferences on evaluating 

use oTtlt ""f of reasons for ambiguity in popular and professional 

T i f ^ evaluation. A fundamental one Is simply that the word 

from thelfff '"^^'^"^ ^^^P^*' di««ences stem pLtly 

from the different disciplines involved. In education, for example It 

Leountinrand'n°olf °""'f sociology, statistics, paychology. 

^?^o =^ If ^.^-^ sciences represented in an evaluation. Differences 

also stem partly from the variety of methods used by a discipline to examine
[illegible]. The administrator may bring case studies to bear. The
statistician normally argues for statistical evidence. The lexical
inventiveness of scholars, bureaucrats, and politicians complicates

"«sBonSiir °^ evaluation," "illuminative evaluation," 

at«nf«f Tk"?"""" f"- neologisms is confusing. In some in- 

of Stl^ IfllLf if ^he activity hinges on people, regardless 

title. If A does it, it's evaluation; and if B does it, it's research.

Finally, it occasionally appears to be prudent to keep the word vague.
The demand to "evaluate" is sufficiently general to permit investigation
and audit without inciting the fears that these activities do. It has
been used as a synonym for research when the word research is found
offensive by legislators or parents.

None of this is peculiar to educational evaluation, judging from
recent work by William Kruskal at Chicago and Frederick Mosteller at
Harvard. Terminology in statistics is used popularly to give scientific
dignity to simple information collection. For example, a writer may
announce that he has taken a "random sample" to lend scientific
legitimacy to his efforts, and we find later that the word haphazard is
rather more accurate. Words like "experiment," "audit," "psychotic,"
and so on are used promiscuously despite professional communities that
subscribe to more or less explicit definitions in each case.

We stress the matter here because the ambiguity does affect
communication at federal, state, and local levels of government. It affects
quality of evaluation, costs, and so on. Some of our recommendations
bear on the problem.



2.2 AUDIENCES FOR EVALUATION RESULTS

In any potential audience for evaluation results, there are pockets
of sturdy indifference as well as pockets of remarkable interest. The
following remarks illustrate this variety at national, state, and local
levels of governance.

National Level 

The GAO's recent work classifies the audiences for evaluation into
three groups: policy-makers, managers, and oversight agencies. Various
public interest groups which occasionally demonstrate an interest in
evaluation, advisory groups, program staff, and parents, and a sizeable
community of evaluators constitute two nongovernment audiences. Not all
members of these audiences are equally attentive.

In principle, the relevant policy-makers include the Congress, since
the demand for evaluation of ongoing programs and many new ones is made
in law. The case studies on use of evaluation in Chapter 6 suggest that
Congress is indeed an audience at least at times. There are explicit
references to evaluations in committee reports, in the incorporation of
evaluation findings into bills, and in the rationale provided for
appropriations. The pertinent cases which include the clearest evidence are
the NIE Compensatory Education Study, the National Day Care Study, Rand's
Study of Federal Programs Supporting Educational Change, Bilingual
Education, the Fund for the Improvement of Post-secondary Education, and
Title I Testing. Evaluations of Follow Through, among others, are
remarkable for the lack of audience reaction, but notable for their use in
provoking discussion.

The reports of the Senate Appropriations Committee, the House
Education and Labor Committee, and others register the interest of this
audience, but indicators of the interest are not always uniformly clear.
For instance, the Senate Committee's 1979 Report enumerates appropriations
and gives a brief rationale for each appropriation. Of the 85 education
budget items the Committee considered, 11 include reference to evaluation
as part of the rationale for support, though evaluations are available
and were used, we believe, in at least 6 more cases. The absence of
evaluative data is mentioned twice, though probably over 20 items in the
catalog have had no formal evaluation at all.

Our sources of information about Congressional staff interests
include a recent survey by Florio, Behrmann, and Goltz of 26 staffers of
subcommittees and committees dealing with education. These include the
House Committee on Education and Labor and five pertinent subcommittees,
the Senate Committee on Labor and Human Resources and four of its
subcommittees, and the Senate Appropriations and Governmental Affairs
Committees. Of the 26 staffers identified as an audience in principle, seven
considered the influence of evaluation on their work to be significant,
eight said importance depended on the issue, and eleven said evaluation
was not useful to them in the form delivered. We conducted interviews
with eight of these staffers, four nominated by Congresswoman Holtzman's
staff, the remainder by ex-staffers, in the interest of independent
review. All were informed about evaluation in some degree. But their
own opinions about how useful evaluations are hinged considerably on
particular evaluations (some being regarded as useful, others not) and on
their definition of "useful." One staff member, for instance, said that
evaluations were not useful to his committee but also said clearly that
evaluations were used to guide Committee members' questions during
Hearings. Their remarks on how evaluation results are used have been
incorporated into Chapter 6.

The most relevant Congressional support staff include four members
of the Congressional Budget Office who capitalize on educational
evaluation as a policy tool. The persons to whom we spoke regard themselves as
discriminating consumers of evaluations, and use evaluations in policy
development. The case study on CBO in Chapter 6 illustrates their use of
results. Members of the Congressional Research Service with
responsibility in education in principle constitute an audience for evaluation.
The two people to whom we spoke regarded themselves as brokers of
evaluation and research.

In principle, federal agency program managers constitute an audience
for evaluation. This involves a presumption that the manager will act
upon the findings of an evaluation supported by the division of
evaluation, by GAO, or some other reasonably independent agent. In practice,
there is confusion: evaluation information requested by Congress is not
necessarily useful to the manager, and evaluative information needed by
the manager is not necessarily useful to Congress. Advertising that
evaluation is "a management tool," as OMB has done, implies, to at least
some policy analysts, that it will then be less useful to policy-makers.
In practice, there is also competition between the program unit and the
evaluation unit for funds and Congress's attention. Some of the anomalous
possible consequences of competition here can be illustrated by a
Congressional staffer's opinion that federal managers do not take action on
evaluations and as a consequence the evaluation unit's budget, not the
program budget, is reduced. Under the new Department of Education, the
evaluation unit is lodged in the Office of the Assistant Secretary for
Management. That Office will, its executives expect, help to link
evaluation more closely to program management without degrading the extent
to which evaluation can meet Congressional requests.

The case studies on uses of evaluation given in Chapter 6 reflect
some management interests in the information. The verifiable uses include
changes in federal regulations produced by the National Day Care Study,
the NIE Compensatory Education Study, smaller scale assessments of Title
I testing programs, and the Rand evaluations of Federal Programs
Supporting Educational Change. The unwillingness or inability to use some
evaluations for legitimate and other reasons is reflected in the Follow
Through case study. The most consistent but small agency audience for
evaluations which estimate effects of programs is the Joint Dissemination
Review Panel, a 22 member board that included OE and NIE staff. The JDRP
routinely reviews programs and evidence submitted by program developers
at the local, state, or federal level to determine if evidence is
sufficient to warrant providing the program with an opportunity to apply for
dissemination grants.

The priority accorded to evaluation by federal managers varies. In
bilingual education, for example, the reviews of grant proposals involve
a 110 point scoring scheme; fifteen points go to evaluation, suggesting
that it cannot play a decisive role in funding decisions in the division
of bilingual education. Judging from public speeches by some bilingual
education experts, the recent AIR evaluation of bilingual education has
provoked more interest in evaluation for the sake of self-protection if
nothing else.

In Vocational Education, the law requires the state board
responsible for program administration to provide an accountability report to
the Commissioner annually. The specification of content practically
legislates management as an audience, in that it must contain a "summary
of the evaluations of programs ... and a description of how the information
from these evaluations has been or is being used" (Section 108, Title I -
Vocational Education). We are unaware of any formal assessment of these
reports or any substantial way in which they meet the requirement.

At the GAO, there is a notable interest in evaluation among members
of the Program Analysis Division, a unit created in response to law
requiring that GAO play a major role in overseeing evaluations. The
interest and some of the expertise is gradually finding its way into
other divisions of the agency. The approaches being developed by GAO to
guide evaluations are based at least partly on earlier evaluative
research supported by the National Institute of Education and National
Science Foundation. In this sense, GAO is an audience for evaluative
methods. GAO is also an audience, at times, for the product of
evaluation itself, if we may judge by GAO's use of evaluation by other agencies.
A case in point, the National Day Care Study, is discussed in detail in
the chapter on uses of evaluation results.

The courts are not a regular audience for evaluations. But
educational policy research, supported at least partly by the federal
government and directed at questions which evaluations address, is used
episodically. Information generated by an OED-supported evaluation of
bilingual education programs has been introduced as evidence in a New York
state court case, Cintron vs. Brentwood School District, on access to
bilingual education. In the Supreme Court, Bakke vs. University of
California employed data from the OED-NCES supported National Longitudinal
Study of the High School Class of 1972. Applied research on the effects of
integrated classrooms on children's achievement, conducted by university
researchers in pursuit of scholarly interests, has been admitted as
evidence in Swann v. Charlotte-Mecklenburg Board of Education, in Hobson
v. Hansen (Washington, D.C.), and Keyes v. School District No. 1 (Denver).
At the state level, investigations by Coleman suggest that in California
research on the effects of desegregation continues to be used partly as
a result of a State Supreme Court ruling on admissibility of the evidence.
Methodological research on evaluation, produced with federal support, is
also of occasional, direct interest. For example, the Federal Judicial
Center has created a Committee on Experiments in the Justice System to
understand how sophisticated evaluation methods, notably randomized field
experiments, can be used to determine whether and how well innovations
work. The Committee, chaired by Judge Edward Re, has included meetings
at which methodological work supported by the National Institute of
Education as well as National Science Foundation has been formally presented.
We have not had the resources for a detailed examination of how often
information generated in the national level evaluations has been used by
the federal, state, or local courts. The preceding remarks merely
illustrate the beginnings of interest in the judicial sector.



The Local and State Interest in National Evaluations

Our site visits to local education agencies generally revealed little
awareness about evaluations produced at the national level. Interest
generally focuses on locally executed evaluation if there is any interest
at all. But there are two important exceptions.

The first exception involves large school districts which often have
research and evaluation units and states with well developed evaluation
practices. Those units are consumers of some evaluations and of
methodological research on evaluation which finds its way into the professional
journals. So, for instance, the input evaluation unit of the Dallas
Independent School District's research and evaluation division is
responsible for monitoring national reports. The examination of reports of
Title I evaluations led eventually to Dallas's field testing of an
instructional approach, DISTAR, identified as successful in the evaluation
report.

A similar exception holds for states with well developed approaches to
evaluation of state and federally supported programs. The usefulness of
this information to state staff is partly rhetorical (persuading someone
of the value of the programs) and partly managerial (identifying good and
bad practice). For instance, Iowa's Oliver Himley cited three national
evaluations bearing on Title I programs in 1979 Oversight Hearings of the
Subcommittee on Elementary, Secondary, and Vocational Education. He appeared
to be the only one of representatives from nine states who knew about the
studies. Kansas and California have both used the OED-supported Rand
Study of Federal Programs Supporting Educational Change to modify
legislation (see the case studies in Chapter 6). And in other states, such
as Michigan and Minnesota, we did encounter program staff and evaluators
who were aware of national evaluation.

A second major class of exceptions to local disinterest or lack of
awareness about national evaluations concerns evaluations which receive
considerable professional or popular press coverage. At Site J, for
example, the director of bilingual education was familiar with the recent
national evaluations of the bilingual program undertaken by OED. The
professional debate over the national evaluation itself led this director
to be concerned about the quality of the report that she received from
the contractor she had hired to evaluate Site J's program, and she asked
us for comments. The national evaluation has been covered by professional
periodicals such as Phi Delta Kappan and by network television. NBC's
Tom Snyder (May 9, 1978) included interviews with federal program
managers and Ohio Representative John Ashbrook. The Rand Corporation's
evaluation of "Federal Programs Supporting Educational Change" is another
case in point. The report itself was covered by at least a half dozen
local newspapers and at least one syndicated columnist. Press coverage
did provoke discussion, though it is not clear how it may have influenced
decisions. Spencer Rich of the Washington Post covered an NIE supported
evaluation of Jesse Jackson's PUSH-EXCEL program (April 22, 1980). The
article, a balanced one, evoked considerable interest in both the program
and evaluation, judging from letters to NIE.

Documenting instances in which popular treatment of professional
evaluation has provoked action is time consuming, and the resources
available to this project have not been sufficient to do so. No formal
newspaper clipping system is employed in any federal educational
evaluation unit we've visited, though there is informal attention to this
topic.



Local and State Audiences for Local Evaluations 

The most recent report of a large scale survey bearing on audiences
for evaluation was issued in 1979 by UCLA's Center for the Study of
Evaluation. Their effort focused on large school districts with research
and evaluation units. Respondents in that survey, directors of research
and evaluation units, reported that the most consistent users of their
reports were superintendents and central office staff of the school
district (60%) and principals (53%). About 30% of the directors said that
teachers were consistent users of the information they generated. School
board members are still less likely to be a regular user of the information,
and parents are reported to be least likely to be a consistent user group
([illegible]). The UCLA survey covers all evaluations generated by a research
unit, not only the federally supported ones.

From our own site visits, we conclude that school board interest in
local evaluation varies considerably from school district to school
district and state to state. The extremely active audience is exemplified
by the school district in Site E, for example, where we were told of one
school board member who, on receiving any proposal for funding, immediately
flips to the budget and the line item for evaluation; who, on receiving
reports about a program, asks where the evaluation is; and who calls
district evaluators to obtain information about evaluations directly and to
remind evaluators that they've missed a deadline. The school board itself
at Site J has required evaluation of some activities every three years, and
executive summaries of evaluation in plain English. In Dallas, the
school board has a formal program evaluation committee which includes
board members to whom research and evaluation reports are regularly
provided. Such interest among school boards is, we believe, exceptional.
For instance, we found no evidence that Title I evaluations in Site A
are considered seriously by the school board, though the attention given
evaluation by the Title I director is clear. Site J's board involvement
is not significant, judging from our interviews, because management and
budget issues are higher in priority.

For Title I programs, at least one local audience is implied by
federal law. Parent Advisory Committees are given responsibility at the
local level for providing advice on implementation of Title I programs.
As a matter of practice, the audience sometimes has less access than it
should to evaluation reports and it may be uninterested. The case study
given in Chapter 6, on use of evaluation findings by Parent Advisory
groups, covers both active and inactive segments of this audience.

In any given site, the audiences range considerably depending on the
nature of the information produced by the evaluator, and the relations
between evaluator and various groups within the district. Title I
evaluation in Providence, for instance, suggests the following pattern of
information needs being satisfied. Systematic assessments of the
relation between instruction time and achievement were eventually used by
Title I teachers and administrators. Assessments of clerical errors in
records, of cut off points on tests used to assign children to different
instructional regimens, were used by program management. Assessments
of class size, of the appropriateness of tests, of classification
problems engendered by tests, of the appropriateness of program objectives
based on tests are provided to teachers and used by them. The Title I
Parent Advisory Committee appears to be a consistent audience for
evaluations showing poor performance in middle schools relative to
elementary schools and has actively used the information in pushing for program
modification.

In other sites, such as Site J and Site F, the federal emphasis on
achievement levels of students is often of considerably less interest to
program managers, superintendents, and school boards than is information
about children's attitudes, children's self-concepts, and parental
reactions to programs.

Consumer and Producer Education

One clear implication of this and others' investigations is that the
audience for evaluation shifts considerably in membership and the need to
inform the audience about purposes of evaluation is a recurring one. Site
J, for instance, has had five superintendents in ten years; the veteran
school board member has had six years' experience and a majority have less
than two. Site A's school board has had over the past five years a dramatic
shift in membership, and part of the program evaluator's problem in that
city is to overcome both ingenuous and informed suspicion of the new
board against district staff. At the state level, changes in membership
of the legislature and legislative staff can be rapid. This puts
substantial demands on state evaluation staff in California, for instance,
to explain the purposes, origin, nature, and conduct of evaluative
activities. At the national level, Congressional staff member turnover is
relatively high. Though there remains a core of individuals who are
sophisticated about the flaws and benefits of evaluation, the audience
changes often enough to put considerable demand on communication efforts.
The problem has been similar with the Education Division of DHEW. Eight
Commissioners of Education in a ten year period and a small army of
migratory executives make matters difficult. Briefings on evaluation by the
OE's Office of Evaluation and Dissemination have become routine and
elaborate in response to the problem.

The audience is not the only group being educated. The difficulty
of communicating about evaluations to a heterogeneous, often nontechnical
audience has had the benefit of spawning new interest in ways of
presenting information. Evaluation staff of the Austin Independent School
District, the Baltimore County Public Schools, and the Philadelphia School
District have been remarkable in this respect. Austin, for example, has
managed to produce a thoughtful manual on reporting, based on hard
experience. The experience in smaller districts and in districts where
evaluation staff are unable or unwilling to provide information about the
way they approach the problem is almost invisible. In some cases, it
is not especially good. Parent Advisory Councils in some areas have
notable difficulty in accessing reports, for example, and in
understanding what is being said on account of technical language. It is
partly on account of such difficulty that the UCLA Center for the Study
of Evaluation has created a newsletter to periodically report on
effective strategies for getting information used at the local level. The
production of the periodical, U.S.E. (Using School Evaluations), is
supported by the National Institute of Education.

2.3 LEGISLATION ON EVALUATION

A fundamental reason why evaluations are done is that Congress wants
them done, if we may judge from law and from our interviews at local, state,
and federal levels. To understand the scope of legislative demands for
evaluation, we undertook a search for statutes bearing on the topic and a
brief analysis of them.

The main vehicle for search was the LEXIS computerized legal document
retrieval system, using Title 20 of the United States Code. Scope of search
was restricted to those educational programs most relevant to this Project,
e.g., elementary and secondary education, vocational education, education
for the handicapped, and so on. Graduate or higher education programs, school
nutrition programs, and library programs were excluded. Moreover, the search
focused on statutes containing the word evaluation. It therefore excludes
evaluative activity which may have been labelled differently. Details of the
search, problems encountered, and detailed analysis are given in the Appendix
to this report. We believe it is the only attempt since the GAO's in 1974 to
map this terrain.
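
The limitation noted above, that a search keyed to the word "evaluation"
passes over evaluative activity described in other terms, can be
illustrated with a minimal sketch. It is not a reconstruction of the LEXIS
search: the function name, the matching rule (the word "evaluation" or its
plural, in any case), and the three provisions are hypothetical and were
written only for illustration.

```python
import re


def mentions_evaluation(text):
    """Return True if the text contains the word 'evaluation' or 'evaluations'.

    Assumption: a simple whole-word, case-insensitive match. The actual
    search rules used in the study are described only in its Appendix.
    """
    return re.search(r"\bevaluations?\b", text, flags=re.IGNORECASE) is not None


# Hypothetical provisions, invented for this example (not statutory text).
provisions = {
    "component": "Each project shall include an evaluation component.",
    "survey": "The agency shall survey the number of children in need of services.",
    "federal": "The Commissioner shall conduct evaluations of funded programs.",
}

# Keys of the provisions that a keyword search would retrieve.
hits = sorted(key for key, text in provisions.items() if mentions_evaluation(text))
print(hits)
```

In this sketch the "survey" provision describes what the report would call
a needs assessment, yet it is not retrieved because the word itself never
appears.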

Recency of Amendments and Enactments 

All the subsections of the U.S. Code containing references to evaluation
have been placed in the Code or amended since 1968. Furthermore, 73% of the
subsections were amended by the 95th Congress in 1977-78. Over 60% of the
subsections were amended or enacted by a single act of legislation, the
Education Amendments of 1978 (Public Law 95-561). The number of citations
to evaluation subsections of the U.S. Code by year is as follows:

Year     1968  '69  '70  '71  '72  '73  '74  '75  '76  '77  '78

Number      1    0    5    0    7    0    7    5    8   11   75

Character of General Provisions

The General Provisions Concerning Education apply to all federally
funded education programs "for which an administrative head of an education
agency has administrative responsibility" (§1221(b)). They further specify
that any state or local application for federal funds for an education
program must contain an evaluation component and assurances of cooperation in
providing data for federal level evaluation efforts. The local agency
must report to the state and the state to the Commissioner of Education
(§1232d; 1232e). The general purpose of such evaluations is to "determine
the effectiveness of covered programs in meeting their statutory objectives"
(§1232d).

At the federal level, there is overlapping and potentially conflicting
authority for evaluation of education programs. The Secretary of DHEW has
had a similar statutory mandate with no clear indication how these evaluation
efforts are to differ from those of the Commissioner of Education (§1226c).
Further, the Comptroller General may conduct special evaluations of programs
upon request by Congress (§1227). The National Institute of Education and
the National Advisory Council on Indian Education also may evaluate educational
programs within the province of their concerns (§§1221e; 1221g).

These statutory sections represent the most general evaluation
requirements and they are stated in rather broad terms. Various statutory
sections require needs assessments (§1226c), outcome evaluations (§1226c;
1231a), and even cost/benefit analyses (§1226c). Another section of the
General Provisions requires the Commissioner of Education to conduct a
comprehensive study of evaluation practices and procedures at the national,
state, and local levels for federally funded elementary and secondary
education programs, and this Project stems from the requirement (§1231a).
While these sections of the General Provisions establish the cornerstone of
federal policy on educational evaluations, the influence of these
requirements remains unclear. The General Provisions are to apply unless
otherwise specified by federal statute, and no such specification was
encountered in the course of this research. Further information and citation
to specific subsections of the General Provisions are provided in the
Appendix to this report.

Specific Programs 

To understand evaluation requirements for specific programs, consider
statutes on education of handicapped children, bilingual education, Title I,
and vocational education.

The Education of Handicapped Children (§§1401-1461). Since 1975, the
Commissioner of Education has had primary responsibility to conduct
evaluations of special education programs. Either directly or by grants or
contracts, the Commissioner has been required to assess the need for special
education programs, the extent to which needed services are delivered, and
the effectiveness of these programs in meeting the goals of special
education (§1418). State and local education agencies must comply with these
regulations as a condition of federal funding of special education programs
(§1412). Each year, the Commissioner is to file a report with Congress on
the progress being made in providing special education services. This
report is to include a detailed description of all evaluation activities
conducted by the Commissioner and an assessment of the effectiveness of each
education agency in providing special education services in the least
restrictive environment and in preventing erroneous classifications (§1418).
The Commissioner must also evaluate the effectiveness of each of the model
centers and experimental programs in meeting the special needs of
handicapped students (§1425).



Bilingual Education (§§3221-3261). The statutory mandate concerning
evaluation of bilingual education programs is vague and seems to reflect
a cautious approach in establishing rules for a new education program.
In fact, in the statutory statement of policy Congress notes that research
and evaluation capabilities in bilingual education need to be strengthened
(§3222). The Commissioner of Education has been directed to develop models
of evaluation to be used in assessing the progress made by participants in
attaining English language skills. The local education agencies then
implement the evaluation models and report to the Commissioner as a condition
of federal funding (§3221). The Commissioner reports to Congress once every
two years concerning the need for bilingual education and the success of the
federal programs in meeting this need (§3241). While these initial evaluation
efforts are underway, Congress requests the Secretary to develop more
sophisticated evaluation and data collection models by September 1980 (§3241).
The National Institute of Education is also required to develop and evaluate
effective models for bilingual education (§3252). Further information and
citation to specific subsections may be found in the Appendix.

Evaluation of Title I Programs (§§2701-2854). The statutory standards for
evaluation of Title I programs delegate primary responsibility for evaluation
of Title I programs to the local agencies (§2734). The state education agency
has been directed to provide technical assistance to the local agencies, and
to compile the findings of the local evaluations in a report for the
Commissioner of Education (§2822). The Commissioner, who must also provide
technical assistance to local agencies, then must combine the findings of the
state reports with national evaluations of Title I programs and present a
biannual report to Congress (§2833). The National Advisory Council on Quality
in Education also may evaluate Title I programs (§3171). Further information
and citation to specific subsections are given in the Appendix.

Vocational Education (§§2301-2461). The core evaluation requirements for
vocational education involve the common scheme of the state education agency
reporting to the Commissioner of Education, who reports to Congress. At
the state level, authority to conduct evaluations is retained by the State
Advisory Council, which in principle must meet detailed standards in evaluating
local vocational education programs (§2305).

At the federal level, the authority to conduct evaluations of vocational
education programs is divided among several agencies with overlapping and
potentially conflicting mandates. In addition to evaluation of vocational
education programs by the Commissioner of Education, federal level evaluations
are to be conducted by at least three other groups. The National Advisory
Council on Vocational Education is authorized to conduct independent
evaluations of state vocational education programs and to file an annual
report with the Commissioner of Education, the Secretary of Labor, the
Congress, and the President (§2392). The National Institute of Education has
authority to "study and evaluate" a broad range of vocational educational
programs in order to make recommendations for the redirection and improvement
of vocational education programs in the next decade. NIE also has the
authority to conduct "not more than three experimental studies" to achieve
the purpose of the evaluation (§2563). Finally, the Commissioner of Education
and the Secretary of Labor have joint authority to evaluate bilingual
vocational training programs and file an annual report with the President and
Congress (§2412).

In addition to these primary evaluation mandates, a number of secondary
mandates for research in vocational education are included. A national
center for research in vocational education was established to develop
"methods of evaluating programs, including follow-up studies of program
completers and leavers" (§2401). A Coordinating Committee on Research in
Vocational Education is established to develop "an effective management
information system . . . to achieve the best possible monitoring and
evaluation of vocational education projects" (§2304). Finally, the
Commissioner of Education is to study sex discrimination and sex stereotyping
in vocational education programs (§2563).



Other Strategies of Evaluation 

In addition to the General Provisions and the four major education
programs discussed above, evaluation standards for twenty-six other
education programs were examined. The standards vary widely across programs.
For some education programs, the local agency is responsible for developing
appropriate evaluation models to be included in the application for federal
funds (e.g., Metric Education - §2953; Community Schools Program - §3288;
Dropout Prevention Programs - §3387). Other education programs place heavy
reliance on contractors or grantees to conduct the required evaluations in
accordance with standards developed by the Commissioner (e.g., Media
Education - §541; Consumer Education - §2983). Some education programs
offer no guidance beyond the general admonition that the federally funded
program should be evaluated (e.g., Educational Improvement and Resources
Support - §3084; Gifted and Talented Children Program - §3315).



Types of Evaluation Activities Required by Education Statutes

To obtain better understanding of the legislative references to
"evaluation," the references were classified as indicating one or more of
the following: needs assessment; process or formative evaluation; outcome
or summative evaluation; cost/benefit analysis; and an "unspecified"
classification. "Needs assessment" was defined as those activities directed
toward determining the nature or extent of a problem, such as a survey to
determine the need for a bilingual education program among migratory children.
"Process or formative evaluation" was defined to include an examination of
the nature of the services being provided or an examination of the
functioning or operations of the education program. "Outcome or summative
evaluation" was defined as those activities designed to determine the effect
of a program on the problem it was intended to solve, or an assessment of the
effectiveness of a program in meeting the purpose of the statute.
"Cost/benefit analysis" was defined as those activities which compare
estimates of the impact of a program with the costs of providing the
services. Any evaluation requiring a comparison between the funds spent on a
program and the results of the program was classified as a "cost/benefit
analysis." The "unspecified" category included those references to
evaluation in which the nature of the required activities could not be
determined from the context of the statute. Finally, an "other" category was
included to permit examination of evaluation mandates which did not fit into
any of the anticipated classifications.

Explicit statements concerning evaluation of the impact of an education
program are much less frequent than more general statements concerning
evaluation of the effectiveness of the statute. Only eight of the twenty-nine
programs or levels of programs for which the statute implies outcome
evaluation either mention or imply an evaluation of the impact of the education
program on the perceived problem (Education of Handicapped Children, State
cational Programs - §2412(a)(2); Preschool Partnership Program - §2917(b)(3);
Biomedical Sciences Program - §3054(a)(11); Emergency School Aid -
§3200(a)(11)). The remaining twenty-one programs or levels of programs
requiring outcome evaluations contained a statement concerning an "evaluation
statute." This phrase occurs with such consistency that it has the character
of statutory boilerplate, employed when there appears to be general interest
in the functioning of the program but there is no intention, consensus, or
clear preference for "outcome or summative" evaluations rather than "process
or formative" evaluations. If one assumes that the use of this general phrase
concerning evaluation of the effectiveness of the education program only
expresses an interest in requiring some form of objective assessment of
program operations, then these more general mandates can be taken out of the
"outcome or summative evaluation" category and combined with the "unknown"
or unspecifiable evaluation mandates. Only eight of the forty programs which
require an impact assessment can be classified as requiring "outcome
evaluations," and thirty of the forty programs contain one or more of the
general mandates which does not specify an evaluation question.

If general statements about effectiveness in achieving purposes of the
statute are classified as outcome or summative evaluation, then this is the
most common type of mandated evaluation activity. In each major education
program and in seventeen of the twenty-six minor education programs, the
purpose of the evaluation requirement was either to determine the impact of
the program or to determine the effectiveness of the program in achieving the
purpose of the statute. The nine minor programs which did not require
outcome evaluations all had "unspecified" evaluation requirements, suggesting
that the absence of language requiring an outcome evaluation did not imply
a preference for one of the other evaluation models.

In six of the forty programs or levels of programs, the evaluation
mandate referred to a needs assessment of some kind. Typical of these
mandates are those for special education in which the Commissioner of
Education is required to determine the number of children in each state with
particular kinds of educational disabilities (Education of Handicapped
Children, Federal Evaluation - §1418(b)(1)). Only some of the statutory
mandates for needs assessments are referred to as evaluation. In six
different programs or levels of programs, similar data collection mandates
were required without using the term "evaluation." All four major education
programs and the General Provisions require needs assessment of some form by
some level of government. This suggests that needs assessment is recognized
as a valuable data collection activity, though it may or may not be referred
to as "evaluation."

Five of the forty programs or levels of programs required a "process
evaluation." The statutory language mandating a process evaluation varies
greatly. Examples include the mandate to special education programs to
evaluate the effectiveness of procedures intended "to assure that
handicapped children receive special education and related services in the
least restrictive environment . . ." (Education of Handicapped Children -
§1418(d)(2)(A)), and the mandate to the National Institute of Education to
"conduct an evaluation and study . . . [to analyse] the means of assessing
program quality and effectiveness" (Vocational Education, Federal Evaluation
by National Institute of Education - §2563(b)(1)(C), also classified as a
mandate for an outcome evaluation). Four of the five mandates for process
evaluation applied to the major education programs.

Three of the forty programs or levels of programs discussed evaluation
in terms of a "cost/benefit analysis." Two of the four major education
programs and the General Provisions require some form of cost/benefit
analysis at some level of the program. Only programs in Bilingual Education
and Education of Handicapped Children have no such requirement. Typical of
such evaluation mandates is the requirement that state evaluations of
Title I programs determine the "effectiveness of parents in improving
educational attainment" (Title I Programs, State Evaluations - §2822).
Two instances were found which expressed similar mandates but which did
not use the term "evaluation" (Vocational Education, State Evaluations -
§2308(b)(2)(B); Career Education Incentive Program, State Evaluation -
§2613(b)).

In fifteen of the forty programs or levels of programs, the nature
of the intended activities could not be determined from the context of the
statute. This lack of specificity was more common to the minor education
programs and typically took the form of a general statement such as, "All
projects shall include an evaluation component" (Correction Education -
§3032(a)). Occasionally the evaluation requirement was simply listed along
with a number of other required activities, such as "research and evaluation"
(Educational Proficiency Standards - §4443(a)(3); Law-related education -
§3002(d)(5)).

Four of the forty programs or levels of programs discussed evaluation
in terms which did not fit any of the above categories. In two instances,
"evaluation" was used to indicate the need for diagnostic testing or
individual assessment (Education of Handicapped Children, State and Local
Evaluations - §§1412(2)(C), 1414(a)(1)(A), 1415(b)(1)(A); Basic Skills
Improvement Program, State Program - §2902(d)(6)). In one instance the term
"evaluation" was used to describe the assessment of consequences for education
programs of changing the statutory definition of the word "Indian" (General
Provisions for Educational Programs, Federal Evaluations - §1221h(b)(3)).

Types of Evaluation in Different Levels of Government



Just as the term "evaluation" can have different meanings across
different education programs, "evaluation" can have different meanings
both within and across different levels of government concerned with
administering a single program. For example, special education programs
required four different kinds of data collection, all described as
"evaluation." For the state and local level, the word "evaluation" is used to
indicate the need for outcome evaluations and diagnostic testing of
individuals (Education of Handicapped Children, State Evaluations -
§§1413(a)(7); 1412(2)(C); 1414(a)(1)(A); 1415(b)(1)(A)). For the federal
level, the word "evaluation" implies needs assessments and process
evaluations, as well as outcome evaluations (Education of Handicapped
Children, Federal Evaluation - §§1418, 1425). The federal level requirements
in the General Provisions use the word "evaluation" to indicate needs
assessments, outcome evaluations, cost/benefit analyses, one "unspecified"
evaluation, and an assessment of the consequences of changing the legal
definition of the term "Indian" (General Provisions for Education Programs,
Federal Evaluations - §§1226c(a); 1231a(a)(3); 1221h(b); 1221(b)).

Despite the variety of meanings of the term "evaluation" in the
mandates for data collection by a single level of government, some conclusions
can be drawn regarding the kinds of evaluation activities which are commonly
assigned to specific levels of government. Examining only the requirements
of the General Provisions and the four major educational programs, one finds
the responsibility for conducting needs assessments and process evaluations
most commonly located at the federal level (Education of Handicapped Children,
Federal Evaluations - §1418(b)(1), 1418(d)(2); Bilingual Education, Federal
Evaluations - §3241(c); Vocational Education, Federal Evaluation by the
National Institute of Education - §2563(b)(1); General Provisions for
Educational Programs, Federal Evaluations - §1226c(a)(2)). When
responsibility for conducting needs assessments and process evaluations was
located at lower levels of government, these duties were not defined as
evaluation (Title I Programs, Local Evaluations - §2734(b); Vocational
Education, State Evaluations - §2551(b)). Mandates for outcome evaluations
and cost/benefit analyses were found at both the federal and state levels.

2.4 FUNCTIONAL CHARACTER OF EVALUATION: STATISTICAL DESCRIPTION

National Level

One vehicle for understanding what questions have been addressed in
federal evaluations is to focus on activities of OE's Office of Evaluation
and Dissemination. The Annual Evaluation Report, issued by OED, carries
descriptions of studies completed during the fiscal year. Reports for
1977, 1978, and 1979 were reviewed. The completed evaluation studies
that were highlighted in each report were classified according to the
questions that the studies addressed. The results are as follows:

Summary of Evaluations
Completed by OED in Fiscal Years 1977-79

Completed in:                         1977    1978    1979

Total Number:                           20      18      26

Percentage focusing primarily on:

  Who is served                        60%     44%     46%

  Nature and cost of services          95%     80%     91%

  Effect on recipients of
    instructional services             25%     44%     35%

  Costs or benefits of
    alternatives                        6%      5%       0



The percentages of studies emphasizing each topic do not add to 100 because
many studies have multiple purposes. Detailed exhibits are given in the
Appendix.
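
The arithmetic behind percentages that exceed 100 in total can be shown
with a small sketch. Each study is counted once under every topic it
emphasizes, and each count is divided by the number of studies. The four
studies and the topic labels below are hypothetical and are not the OED
data; the function name is likewise an assumption made for illustration.

```python
from collections import Counter


def topic_percentages(studies):
    """Return {topic: percentage of studies that emphasize that topic}.

    Each study is a set of topic labels, so one study can add to the
    count of several topics at once.
    """
    counts = Counter(topic for topics in studies for topic in set(topics))
    total = len(studies)
    return {topic: 100.0 * n / total for topic, n in counts.items()}


# Hypothetical studies, invented for this example (not the OED studies).
studies = [
    {"who_is_served", "nature_and_cost"},
    {"nature_and_cost"},
    {"nature_and_cost", "effects"},
    {"who_is_served", "nature_and_cost", "effects"},
]

shares = topic_percentages(studies)
total_share = sum(shares.values())  # exceeds 100 when studies carry several topics
print(shares, total_share)
```

With an average of two topics per study, as in this illustration, the
percentages total 200 rather than 100.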

The main inferences we draw from the table are that:

(a) Contrary to common views of the Office of Evaluation and Dissemination,
most studies have not been directed at estimating the effects of
programs on their major target groups. (See Chapter 3 on costs).

(b) The proportion of evaluations with a strong emphasis on examining
costs or benefits of alternatives is puny.

The National Institute of Education's primary mission is research
rather than evaluation of ongoing programs. The development work on new
programs engenders activities which could be labelled evaluation, however.
And the support of work on methods of measuring achievement, on designing
experiments, and on other topics is pertinent to evaluations undertaken by
other federal agencies, including the operating components of the new
department. The 35 contracts and grants awarded by NIE and current in 1977,
1978, and 1979 were classified to determine how they relate to questions
normally addressed in evaluations. The results are as follows:

Total Number of Contracts and Grants                 35

Percentage focusing primarily on:

  Characteristics of children, intellective
    growth, etc.                                    20%

  Nature of projects or programs                    23%

  Effects of new projects, program
    components, or program variations               49%

Detailed information is given in the Appendix. Again, most research
projects have several main themes and so the percentages do not sum to 100.
The major exceptions to the regular grant or contract research bearing
on evaluation are the specially mandated studies such as the NIE Compensatory
Education Study, the Safe School Study, Project Propinquity, and the
Vocational Educational Study that is currently underway. These have been
excluded from the count.

The Office of the Assistant Secretary for Planning and Evaluation in
Education has, despite its title, invested most resources in planning and
feasibility studies and in policy analyses rather than field evaluations.
For the reports issued between 1976 and 1979, supported by contract, we
found the following:

Total number of reports:                  70

Percentage emphasizing:

  Descriptive policy analyses            41%

  Planning and feasibility studies       24%

  Data acquisition and processing        20%

  Evaluation                              7%

  Technical Assistance                    7%



Roughly speaking then, the agency most likely to investigate
implementation has been the Office of Evaluation and Dissemination. Addressing
questions about effects of programs is less frequent than addressing
questions about who is served and how services are delivered. Questions about
the effects of programs are more likely to be addressed by NIE, but the
programs examined are generally new or experimental and small. ASPE
produced few reports bearing on evaluation. These reports were produced
through grants and contracts.

More generally, if we examine federal law for 29 major programs, we
find in most of these, the language demands evaluation of the effectiveness
of a program in "meeting the objectives of the statute." In the absence of
other information we infer that any or all of the questions about who is
served, the nature of service, and so on, could properly be addressed in an
agency's efforts to implement the requirement. In a minority of statutes,
there is a clear stress on assessing impact of the program on a particular
problem. The provisions in Public Law 94-142, for instance, are more
specific in asking that information be obtained on who is served, the level
of need for service, and the nature of services. The statute by itself does
not assist us in understanding whether one ought to try to estimate the
effect of those services on children. In six of 40 major and minor programs,
there is an explicit demand for needs assessment and in three there is an
explicit reference to cost/benefit analyses. For most programs, then, the
language is general and cannot be used to judge which questions can be or
should be addressed in a field evaluation. The statutes containing more
specific language are exceptional, and usually concern special studies which
the Congress wants undertaken.

Special Studies 

The specially mandated studies are too few to make statistical
description useful. Their aims vary considerably but address one or more of
the questions outlined earlier. Consider the following illustrations.

The Safe School Study, for example, was undertaken by NIE at Congress's
direction to "determine the number of schools affected by crime or violence,
the type and seriousness of the crimes, and how crime could be prevented."
It was mandated as part of the Educational Amendments of 1974 (Public Law
93-380). The report, completed in 1978, is based heavily on surveys. Its
specific origins lie partly in initiatives by Representative Bingham of
New York and Bell of California and by Senator Cranston of California.

The NIE Compensatory Education Study was required, under the law, to
examine purposes and operation of the Title I program and to analyze its
effectiveness. The focus on operations, including alternative allocation
formulae, was substantial. The origins of the study lie partly in general
concerns about the absence of good information on Title I performance and
special interests of Representative Quie in alternative approaches to
allocation, judging from the hearings preceding enactment of Public Law
93-380.

Among special studies initiated at the executive level of DHEW, the
recent National Evaluation of the Cities in Schools Program makes it very
plain that it contains "no impact data, no measures of results" of this
attempt to use schools as a base for human services delivery. The federal
government's unusual interest in support of this nonagency program and its
evaluation stems from interest at the Secretary level, and the emphasis on
diagnosis and process seems to have reflected that level's particular
interests as well.

Local Level Evaluation 

No major investigation of the kinds of questions addressed in evaluation
of federally supported programs at the local level has ever been undertaken.
In the following remarks, we use several sources of information to
characterize the activity.

UCLA's Center for the Study of Evaluation recently surveyed over 200
large school districts with research and evaluation units. The Center
asked about research and evaluation generally, rather than about federal
programs in particular. We have no reason to expect federal and nonfederal
programs to be treated differently within units, apart from the demands
engendered by reporting requirements. And so we are willing to assume that
the allocation of time and priority attached to activity applies to federal
government.



According to 70% of unit directors, assessing student achievement of
objectives is accorded high priority and demands a substantial amount of
their time. Not all these efforts are tied to a particular program, however.
Of the research reports received in UCLA's survey, about 60% refer to specific
programs. Checking that implementation conforms to program specification
was reported to be a unit activity by about 6..% of respondents and ranked
high for time commitment by only 21%. Modifying programs by using
evaluation results was reported to be a unit activity by less than half the
respondents, with high time commitment acknowledged by only 15%. The least
frequent activity of units appears to be comparing costs and benefits of
alternative programs. Only 20% of the respondents acknowledge the activity
and less than 2% ranked it high for time commitment.

The stress, then, is on tracking student progress toward goals and,
to some extent, assisting in modifying the program. Answering questions
about costs and benefits of alternative approaches is accorded low priority.

Our site visits to school districts were not inconsistent with the UCLA
survey findings. Very few clear instances of systematic work on costs and
benefits of alternative programs or of program variations emerged.
Considerable attention was dedicated to testing student achievement in the
interest of observing progress and making comparisons across schools or
districts. A good deal of the work on program implementation involved
interviewing teachers, principals, parents, and program staffers, if such work
was done at all. Differences did emerge between districts with strong
research and evaluation units and those without, primarily in the range of
questions addressed, production of reports, and sophistication.

We are aware of only two formal attempts to obtain survey information
about how local resources are expended within a school district to answer
different kinds of evaluation questions. One effort, undertaken by William
Webster of the Dallas School District and Daniel Stufflebeam of Western
Michigan University, focused on the 35 directors of research in the urban
school districts who responded to a questionnaire survey of 60 of the largest
districts. According to their responses, about 20% of resources are dedicated
to answering questions about the character of program delivery and about the
same proportion is dedicated to product evaluation. The remaining resources
are dedicated to a wide range of other activities including management
(typically around 5%), testing (typically 15%), and data processing (10%). The
survey of large districts with evaluation units suggests that an average
of 10% of the unit's time is dedicated to meeting federal reporting or
evaluation requirements. This is not unseemly in view of the fact that on
average the same respondents reported that 18% of the unit's operating budget
comes from federal sources.

2.5 THE DECISION TO EVALUATE

Over the past 10 years, one of the major lessons learned about
evaluation is that, at the national level, it is not easy, not simple, and not
cheap. The demand to evaluate, however virtuous, can be agonizingly
difficult to carry out. The following remarks focus primarily on the issues at
the national level.

The Questions as Fundamental 

Deciding which questions ought to be answered in an evaluation is
fundamental. The import of this decision has been stressed in guidelines
by Congressional staff members such as Harrison Fox and in
published papers by staff of Congressional support agencies, such as the
U.S. General Accounting Office and the Congressional Budget Office. It
has been stressed in public papers by federal executives such as Alice
Rivlin at CBO, John Evans at USOE, and Michael Timpane at NIE. It is
recognized by the large school districts with sophisticated research units
that we visited, and by the states with a strong, if recent, tradition of
obtaining sensible evidence bearing on the value of programs.

The reason for this attention is that the questions drive all subsequent
decisions, including deciding how the evaluation will be done, who
will do it, and how results will be used. Answering questions about who
is served may require formal information systems created by the education
agency, or periodic surveys by an independent contractor, or both when there
is some interest in gauging quality of the data. Questions about what kinds
of services are offered may require intensive case studies or surveys,
depending on how the information is to be used. Questions about what the
effects of programs are may involve each of these activities, simply because
it makes sense to assure that somebody is indeed served and services have
an identifiable character before trying to estimate effects. Determining
effects of new programs on children and others generally demands more resources
and planning time if estimates must be relatively unambiguous. An evaluation
design must be developed, and assignment of individuals to a program must
accord with the design. If the program is emplaced without attention to
evaluation design, it may not be possible to estimate effects at all.

The questions that are asked also determine receptivity of audiences
for results. The numbers of individuals served and the nature and costs of
services are of interest to many managers and policy makers, judging from
Congressional hearings, decisions about budgets, and the like. The tradition
in the United States of trying to understand systematically the effects of
social programs on the recipients of services is not very long. And so the
audiences for these results are more difficult to identify, and the debate
over results is likely to be more vigorous if the conclusions are not
pleasant. The decisions one can make on the basis of such information will
often be debatable.

Mechanisms for Deciding Whether to Evaluate

Mechanisms for deciding whether to evaluate vary considerably across
federal, state, and local levels. Generally, formal review mechanisms
prevail at the federal level, and the decision process is less often formal
at state and local levels.

At the federal level, the major device for making decisions about
ongoing programs has been committee review of suggestions about what to
evaluate and administrative procedures to support review. The approach
has been taken by the Department of Health, Education, and Welfare, the
General Accounting Office, and other agencies, though the operating
characteristics of each review group differ across agencies. An analogous approach,
involving committee development of a portfolio of evaluation tasks and
review by the legislature, has been taken by California, according to
Alex Law of the State's Department of Education. A few well developed
research and evaluation units at the local level have committees to assist,
review, or oversee evaluation planning, e.g., the Dallas School Board's
evaluation planning committee.

Until 1980, within the USOE, the Evaluation Planning Group made
decisions, within limits imposed by law and resources, about whether a program
should be evaluated. No minutes of the meetings of this Group are available
and no one outside government has normally been present. But the operation
of the Group is traceable partly through its product, an Evaluation Plan
for three fiscal years which has been prepared annually, and through interviews.

The first step in the process has involved an annual request for
suggestions, made by the Assistant Secretary for Education to the
Commissioner, Deputy Commissioners and Executive Deputies, the Directors
of the Office of Evaluation and Dissemination, and others. The Education
sector activities, vested in the Evaluation Planning Group, fit into this
large framework for evaluation generated by the Under Secretary of Health,
Education and Welfare. Guidelines developed in 1978 and issued by the
Under Secretary cover evaluation, research, and statistical activities and
are detailed.

Within the Office of Evaluation and Dissemination, the response to the
request involved specification of the program and project for which an
evaluation is thought necessary, and the focus and purpose of the proposed
evaluation.

The criteria set out to guide the submission process have included:

. Expiration dates for legislation bearing on the programs
and the expected period in which hearings could capitalize on
the information (12-18 months before new legislation).

. Programs with high priority but which had not been evaluated
earlier on account of limited funds. Priority has been
determined by the needs of program managers and the interests
of Congress, DHEW, and the general public.

. Programs in which evaluations are obsolete or otherwise
invalid.

The development of a list of candidate evaluations has been an iterative
process within OED as priorities, available resources, and other factors
are discussed. The Evaluation Planning Group, chaired by OE's Executive
Deputy Commissioner for Resources and Operations, has consisted of
evaluation and policy officials. The result of the effort is a Proposed
Evaluation Plan for the next three years, describing the character of
the work and its costs. Final approval has been made at the Secretary
level.

A major difficulty identified by agency staff familiar with the
process is that the volume of work at the Assistant Secretary level has
been high and the time frame too short to adequately assay the
consequences of a decision. Moreover, the process is alleged to have been
bureaucratically cumbersome. Part of both problems may be reduced with
the creation of the new Department. We understand that a new mechanism
is being developed.

Alternate approaches have been tried. For example, in the
Congressionally mandated Compensatory Education Study with funds earmarked for
evaluation, this committee structure was immaterial. But its equivalent
had to be set up within the Study group to determine which aspects of
the program or project may deserve attention. The "equivalent" amounts
to a loosely defined group of individuals which includes project staff and
Congressional staff members with sufficient interest and ability to
influence the nature of questions being addressed.

There is no regular procedure in the Congress or its support agencies
for considering whether a particular evaluation is worthwhile and the sense
in which it may be worthwhile. For any particular program, the
procedure is rarely formalized in law. Remarkable exceptions include the
recent NIE Compensatory Education Study. Among other requirements, the
legislation mandating that Study asked that the evaluation plans be
submitted for Congressional review.



Evaluability Assessment

Evaluability assessment is a formal procedure developed over the past
10 years at the Urban Institute to facilitate the process of deciding
whether to evaluate and the sense in which evaluation is possible. It
asks that one first systematically "define a program in terms that agree
with the manager's or policy makers' intentions," so as to permit sensible
judgments about what information one ought to collect and at what level of
detail.

A major part of the exercise is a specification of who will use the
information and how it will be used, the level of information needed to
take action, and the expected impact of testing the assumptions underlying
the program. The process involves considerable interaction with managers
or policy makers to establish what the program is supposed to do and a
model of how it is supposed to do it. Subsequent analysis is designed to
understand whether the program is sufficiently unambiguous to make
evaluation useful.

The process makes explicit what others do informally. NIE's
Compensatory Education Study, for instance, involved an intensive effort to get
at similar kinds of information from managers and Congressional staffers
before the actual evaluation was initiated. In particular, there was a
formal effort to understand and document major features of program
operations, notably federal, state, and local relationships, which were poorly
understood at the time. It is clear that evaluability assessment is useful,
however, and ought to be regarded as a legitimate procedural option in
understanding whether and how to evaluate.

Until 1979 or so, most evaluations undertaken at the national level
were preceded by informal, rather than formal, evaluability assessment.
Current interest in making the process routine is reflected in recent
activity of the Office of the Assistant Secretary for Evaluation and
Program Management. A unit within the division has issued requests for
proposals for about ten independent assessments and conducted several
in-house. The in-house efforts have included an effort to better understand
whether evaluation money could be expected to do much good in
Follow Through and an assessment of the Cooperative Education Program.

The tentative policy of the new Office of Evaluation and Program
Management lodges responsibility for evaluability assessment with the
Division of Program Assessments, one of the units in the Management
Division of the Office. Program assessments will include short term
studies of three types: evaluability assessments, service delivery
assessments (SDAs), and program audits. SDAs are similar in intent and practice
to activities undertaken earlier by the Inspector General's Office in DHEW.

The Legislative Decision to Evaluate 

There is no unique mechanism for deciding when to demand evaluation
in Congress. Rather, the demand to evaluate stems from the normal process
of decision-making. So, for instance, the Senate Committee on Appropriations
has, during hearings, asked that OE staff obtain evaluative information for
later hearings. More formal requests that are incorporated into
law stem at times from intensive deliberations, as in the case of the 1974
mandate for the NIE Compensatory Education Study (Elementary and Secondary
Education Amendments). The requests are, at times, made law without much
debate or specification, as in boilerplate requirements that the agency
evaluate to determine whether the program is meeting purposes of the statute.
The statutory requests may be influenced by GAO investigations, or by CBO
policy analyses. To the extent that evaluation involves only case study,
then perhaps this diversity is warranted. The problem of course is that
the level of commitment may be much greater, and justifiably so. The point is
that origins of the demand are diverse and that there is no special mechanism
for review of demands at their source.

The absence of a special mechanism has several implications. First,
it means that anyone with an interest in meeting Congressional demands
can refer to no central guidance on meeting the demand well. This implies
the process will be cumbersome and will demand a fair amount of interaction
between Congressional staff and agency staff. It means that judgments about
what can be undertaken are made after the fact. To the extent this process is
not undertaken quickly, the opportunity for confusion increases.

To clarify the legislative decisions one may examine Reports, Hearings,
and so on. These are informative for major programs and evaluations. But
they are very terse and immaterial for many others. Moreover, there is no
special mechanism to clarify decisions. The direct implications are that
conversations about what the demand means between agency staff and
Congressional staff are episodic and productive at best. At worst, they are
entirely absent. The infrequency does not foster trust or at least the
informed skepticism necessary for a working relation. And it can lead to
unnecessary suspicion. For example, we understand from a Congressional staff
member's public remarks that there was notable suspicion about a contractor's
investigations of alternative allocation formulae in the early days of the
Sustaining Effects Study. That suspicion was produced at least partly by
unfamiliarity of staffers and contractors with each other. More dismal is
the case of no communication between camps, resulting in post facto criticism
which may or may not be warranted. The absence of special clarifying
mechanisms does not, we believe, help the bureaucrat respond in a timely fashion,
simply because a more routine system of review also constitutes a reminder
system.

Sources of Confusion. Sources of confusion are numerous. They include,
we are told, the problem of deducing whether the program is a new educational
exercise whose effects must be determined or a civil rights mandate. The
Bureau of Education for the Handicapped, for instance, regards Public Law
94-142 as a civil rights mandate: access for handicapped children to a free
public education appropriate to their own unique needs. One can argue, then,
that establishing that service is delivered is a sufficient evaluation. Some
educators, on the other hand, see the same law generating an education problem
in that no one is well prepared to serve such children. This, in turn, may
imply estimating the effects of the program on children.

The illustration points up a more general source of conflict. There
are a variety of views among federal agency and Congressional staff about
whether the Congress is interested in estimating the effects of programs on
the program's primary target groups, usually children. A few of the
Congressional staff were emphatic in their view that Congress is interested.
And at least one survey, by Florio and others, seems to bear that out.

The Florio et al. study of 26 Congressional staffers in 1978-79 asked
for ratings of the kind of information in which they were most interested.
Information about effects of the program on individuals, institutions, and
agencies clearly had highest priority. Costs, demographics, and opinions
were clustered well below, in that order.

A few agency staff were equally emphatic about Congressional
disinterest in the topic, maintaining that the primary focus should be
"understanding where the money goes and who gets served." Complicating
the problem is the question of how much evaluation in management's interest
should be tacked onto an evaluation designed to satisfy Congressional
requirements for information. The NIE Compensatory Education Study put
management a poor second to Congressional interests.

The fact that statutory demands for evaluations can be spawned in
Congressional support agencies such as CBO and GAO, by Committees, by
individual members of Congress, and others is, as we've said, a potential
source of confusion. The diversity also serves as a rich source of ideas,
and it is difficult to see how confusion can be reduced while maintaining
diversity.

And of course there is a good deal of vagueness about decisions which might
stem from the questions one addresses in an evaluation. In principle,
one could specify what kinds of decisions would be made based on the
information one accumulates. In practice, that specification is difficult,
if not impossible, because (a) insufficient time is allocated to lay out
decision options, (b) the nature of decisions cannot be specified well before
the information is collected, (c) the decision options may change independent
of the evaluation, or (d) no one is willing or able to specify decision
options.

All this means that the time and effort required to clarify a simple
mandate to evaluate can be demanding. Consider, for instance, the problem
of developing an evaluation plan for evaluating administration of Public
Law 94-142, on providing access to free and appropriate education for the
handicapped. Some 18 months were required for the task and were permitted
by virtue of the fact that the law became effective two years after
enactment. According to Garry McDaniels and Mary Kennedy of BEH, the questions
were made explicit and modified during the course of planning on the basis of
interviews with members of advocacy groups, federal agency staff, and state
agency staff. This effort and the law's general reference to evaluation of
administration of the program resulted in questions bearing on the extent
to which intended beneficiaries are served, the setting and types of service
provided, administrative mechanisms in place, the consequences of implementing
the law, and the extent to which the statute was met. Similarly complicated
negotiations took over six months in the NIE Compensatory Education Study
of Title I.

Specifying Decisions. We have found few formal attempts to specify
alternative decisions which could be made by any federal agency or by
Congressional staff on the basis of a given planned evaluation. The
exceptions are so-called evaluability assessments, which do try to address the
question of how information, once obtained, will be used. This precursor
to a formal evaluation is demonstrably feasible at times, when the audience
consists of managers in an agency. It is not clearly feasible in the
legislative arena. But apart from recent studies by the GAO, it does not appear
to have been tried out often in this context.

Regardless of whether decision options can be specified, there is
strong disagreement about what the information implies. Consider, for
example, a program found to have failed on most counts in meeting its
objectives. At least one camp within the federal executive branch takes
the position that because it failed, more money ought to be put into the
program to make it succeed. A second camp will argue for its termination
because the program has failed. Still a third camp will make the decision
one way if the program is a demonstration project and the other if it was
created as a service program. Complicating the matter is that similar
disagreements are evident among Congressional staff. Regardless of whether
decisions are specified, regardless of disagreement over implications of
the data, the evaluation forms only a part of the information obtained on
any program. Other information may carry considerably more weight.

This Project has not examined the decision processes carefully; our
mission was to attend to a variety of other topics. The matter is
pertinent here in two respects. Without prior specification of what decisions
are possible if particular results emerge, determining subsequent use of
results will be more difficult and may be impossible. Without prior
specification it is considerably more difficult to design evaluations so as to
be "relevant."

One peculiarity of our interviews with Congressional staff was the
reluctance of a few of them to talk to contractors who are responsible for
executing evaluations. The point is pertinent here in that it may be
necessary for contractors to verify evaluation goals independent of the
federal agency. The reluctance was mild but vague: "A 's sends contractors
around to talk . . . I haven't got a lot of time . . . I am not licensed to
talk to contractors by my committee chairman." The respondent who needed
licensing gave the same reason for not meeting with other staffers, agency
or Congressional. We do not believe this is a serious problem, but we
haven't talked to a large number of staff members. If it is serious, then
the prospects of clarifying objectives of evaluators are dim.

Mechanisms for Clarifying Demands and Decisions. At least one CBO staff
member believes that agency staff members do not spend enough time talking
with Congressional staff, and that more time is necessary for building good
evaluations by guiding better understanding of the opinions and views
espoused by both groups. Similar suggestions were made publicly in recent
meetings of the American Educational Research Association by Congressional
staff member John Jennings, and agency executives such as John Evans and
Carl Wisler. The same spirit of exchange emerged in interviews with staff
at the Assistant Secretary level.

The flaws in the system mean that evaluations are sometimes not
timely, and often it is not clear whether evaluations will be timely or
not. Untimely reports are not typical. But the problem occurs often
enough to justify concern by Jennings. The examples Jennings cites include
a report on children of Title I migrant workers, which arrived too late
for use in reauthorization, and a request for proposal for evaluating Title IV
while the program was scheduled for reauthorization in the same year. The
flaws also mean confusion over what is intended by law and how the agency
interprets the law and the mandate to evaluate. In public remarks at
professional association meetings, Jennings cites agency confusion over
how the Emergency School Assistance Act works and Congressional suspicion
over the definition of income by a contractor examining Title I programs.

The elements of an improved practice appear to include:

(a) regular meetings among legislative and agency staff to
make decisions about when evaluations are warranted, to
determine broadly how evaluations should be carried out,
and to report on progress;

(b) an information system which will make access to previous
related evaluations simpler;

(c) participation by technically knowledgeable staff as well as
political staff in discussion. This includes, for instance,
knowledgeable staff of the Office of Evaluation and of the
pertinent Division of the U.S. General Accounting Office,
as well as CBO and the Congressional staff;

(d) a planning system which matches production of evaluative
reports to the budget cycle;

(e) planning time.

Some efforts were made in 1978-79 to remedy the problem of faulty
communication, but without much success. More recent efforts include
meetings at the Deputy Assistant Secretary level with Congressional staff
to lay out plans for evaluation.

We believe this intention is sensible and ought to be vigorously 
implemented. 



Footnotes 

In this chapter and all others, full references to the documents cited
are given in the reference list, Chapter 8. The text citation includes
an individual author where possible, and that chapter lists documents
by author. Where acknowledgment of individuals is not possible, the
text identifies the organization that produced the report, and the reference
list entry identifies the organization as author.

CHAPTER 3. HOW ARE EVALUATIONS CONDUCTED?

David S. Cordray, Robert F. Boruch
and Georgine Pion



Always be suspicious of data collection 
that goes according to plan. 

In Patton, 1980.

3. HOW ARE EVALUATIONS CONDUCTED?

This chapter describes how evaluations are carried out at local,
state and national levels of government. It is organized into five
sections. Section 3.1 describes the basic elements of what we believe
constitutes good evaluation research practice. To a certain extent, the
law and regulations play a role in guiding evaluation practices
at each level of government. Section 3.2 describes the explicitness with
which the law prescribes evaluation methods. Section 3.3 examines similar
issues pertaining to federal regulations.

Depending on the amount of discretion, resources and capabilities
of an agency, federal, state, and local evaluations may exceed the
requirements specified by the law and/or regulations. Consequently, describing
the way programs are evaluated at each level requires an examination of
the factors that contribute to evaluation practices beyond the requirements.
These factors are described in Section 3.4. The last section, 3.5, takes a
broader look at the type, scope and execution of numerous national level
evaluations. This section provides brief illustrations of the procedures
used in federal evaluations.

How well evaluations are performed depends on this material and on the
capabilities of those who are responsible for their completion. The topic
is discussed in Chapter 5.

3.1 ELEMENTS OF AN EVALUATION

To avoid some chronic misunderstanding here, and make plain what we
believe are sensible steps in an evaluation, we describe the elements
briefly. The elements are desirable, in principle, judging from guidelines
issued by professional organizations, by Congressional support agencies
such as the General Accounting Office, and by federal evaluation agencies.
The elements include:

. Deciding to evaluate and specifying the questions to be addressed 
by evaluation. 

. Designing the evaluation. 

. Deciding who will carry out the evaluation. 

. Conducting the evaluation and pertinent side studies. 

. Analyzing results, making recommendations, and reporting. 

. Evaluating an evaluation. 

. Dissemination and use of the results.

They are not always a matter of practice. 

Decision to Evaluate and the Questions to be Addressed

The decision to evaluate may be determined, as in the case of
Congressionally required studies, or an agency may have some discretion in
the matter. In either case, the decision should hinge on what questions
ought to be addressed in an evaluation, whether anyone is interested in
using the answers, and whether it is possible to answer the questions in a
fair and timely way with the resources at hand. If the decision is made
by a legislature, the problem of defining questions is less often resolved
by the legislature than by the bureaucracy asked to handle the evaluation.
At the federal level and in some states, the decision may be articulated
by a formal committee and checked against judgments of the legislature.
If the information available for decision is insufficient and someone has
the wit to recognize the fact, exploratory studies should be undertaken
to obtain it. The actual form an evaluation may take depends heavily
on the information available in making this decision.

Design of Evaluations

The design of evaluations depends heavily on the preceding element:
it makes no sense to design an evaluation unless one knows what information
is wanted, by whom, for what purpose. In the ideal case, questions are
refined at the design stage of evaluation and, typically, the technical
solutions to problems about how to obtain the information, at what level of
detail and quality, at what cost, will be laid out. Also, in the best cases,
the design stage will identify solutions to probable managerial,
political-institutional, legal or ethical, and scientific problems engendered by the
need to evaluate. Also at its best, the process includes a review of earlier
work on the topic: a literature review and conversations with those who have
had a hand in producing that literature.

At the federal level, and in some states, law provides a general
framework, and the task is articulated by pertinent government evaluation 
staff and, ideally, relevant legislative support staff. The more specific 
details are worked out by a contractor or agency staff responsible for 
specific design and actual execution of the evaluation. 



Deciding Who Will Carry Out the Evaluation

Numerous classes of individuals may be designated as "evaluators."
Exactly who is designated depends on the nature of the program, the level
of government within which the evaluation activity is undertaken, and the
requirements imposed by law and regulations. These issues will be
discussed in subsequent sections of this chapter and in the next chapter.
Since selecting an individual outside the agency represents the most complex
type of decision regarding who will conduct the evaluation, we focus here on
those aspects associated with the contractor model of evaluation.

At the federal level, in education, all but a very small fraction of
evaluations of ongoing programs are executed through competitively bid
contracts. A similar competitive bidding process is used in some states,
and at their best such states provide guidelines to local education
agencies which use contractors as well. A request for proposal (RFP) is
issued. At its best, the request asks that the contractor submit alternative
evaluation design plans and their justification if the design elaborated
in an RFP is not regarded as feasible. The request should elicit fundamental
information such as who will be involved, at what level of activity,
at what cost, and in what time frame. Milestones may be specified by either
the contractor or the agency issuing the request. In the ideal case,
review of proposals is based on explicit standards, laid out in the
request, and is conducted by people who are well informed about evaluation
from both inside the government agency and outside it.

Conduct of the Evaluation 

This part of the exercise demands managerial, technical, and a variety
of other skills to put plans into effect. At its best, this stage recognizes
that not all problems can be anticipated and provides for side studies in
budgets and resource allocation. At the national level, the demands of
advisory agencies, such as CEIS, and of authorizing agents, such as FEDAC, or
their equivalent must be met. In the ideal case, tentative clearance is
given automatically and the groups actually provide useful information
about the project. This stage, regardless of government level, usually
involves coordinating information collection from diverse groups (school
districts, for instance), public relations, management and quality control
of information collection, consolidating information, liaison, budgeting,
and other tasks. In the simplest case, staff will be readily available
and their capabilities transparent. In the more typical case, staff will
have to be recruited by the project, and there is a clear need to develop
strategies for accommodating expected but unpredictable incompetence in
at least a few individuals. This holds for project staff, advisory board
members, and the government staff with whom one deals. For evaluations in
new arenas, pilot tests of the entire evaluation process are warranted in
education as they are in health services, for instance.

Analysis, Reporting and Recommendations

Collecting information represents one phase of the evaluation process.
Ensuring that the information is of high quality, reliable, and valid is an
additional technical requirement of the process. Synthesizing the
information so that it addresses and answers the evaluation questions
represents the primary function of the analysis phase. Assuring the integrity
of the conclusions derived from the analysis often requires careful
examination and review of competing explanations. Providing additional
evidence and/or a rationale for the integrity of a conclusion is often
warranted in order to ensure effective communication and ward off unnecessary
or incompetent criticism. Until recently, contractors at the federal level
have not always been asked to provide recommendations; rather, policy
recommendations were derived by agency personnel. It is reasonable to expect
independent sets of recommendations to be useful, providing an opportunity
for input from multiple diverse sources.

Reporting in the best of cases has directed specific messages to specific
audiences using a medium and style suited to each audience. In the ideal
case, the clearance or review of reports is brief, the process does not
prevent the report from being timely, and it does improve the quality of the
report.

Evaluation of Evaluations

Although proper conduct of evaluation research places a premium on
routine examination of the results for the presence of alternative
explanations, for appropriate application of analysis procedures, and for the
validity of conclusions, these practices are not always conducted by the
evaluator. At least one independent review and, possibly, reanalysis is
warranted in many cases. Such reviews can increase the quality of the
evaluation effort by (1) pointing to additional explanations which should
be considered, (2) identifying questions that went unanswered in the
original analysis, and (3) clarifying the meaning of ambiguous aspects of
the report.

Dissemination and Use of Results 

Whether results of evaluation can be used and are used depends partly
on what the evaluation questions were to begin with, who the audiences
are, and on incentives and ability to exploit results. The organization
responsible for carrying out the evaluation may also apply results, if it
is linked with staff of the program under evaluation. If the evaluator
is independent, as contractors are, for example, application will usually
be the responsibility of others. Tracking the use of results is difficult
regardless of who is responsible. That is, determining whether views are
changed as a result of evaluation, whether specific decisions are made,
and so on, is not easy. Evaluation forms only a part of the information
available to inform any decision, and separating the influence of the
evaluation from the influence of pressure groups, individual intuition,
and the like may be impossible. A system for routinely documenting
utilization simplifies matters. This element of evaluation and the problems
in execution are not much different, in principle, at local, state, and
federal levels.



Obstacles 

The obstacles to performing any of these tasks at the local, state,
or federal levels can be broadly classified into four problem areas.
Managerial problems include assuring that staff are available and capable,
that organization is sufficient, that cooperation is available from the
often large number of groups whose cooperation is needed, and that time and
other resources are sufficient. Political-institutional issues include
accommodating or circumventing pressure groups and satisfying legitimate
interests in short-term and long-term information. Scientific problems
include assuring that the evaluation design is technically adequate, that the
conduct of the evaluation accords with the design, that one can educe the
implications of inevitable deviations from design, that one can sensibly
analyze partially reliable data, and so on. Legal and ethical problems may
include assuring privacy of the individual and confidentiality of response,
assuring due process in evaluations which must control assignment of the
individual to program variations, and other matters. These difficulties are
discussed in most good texts on evaluation. Research on alternative solutions
to the problems is often part of major evaluation studies, but independent
work is supported by federal agencies such as NIE and NSF.

Evaluation Contexts: State Administered versus Direct Grant Programs

Our discussion of the salient elements of an evaluation is idealized
if one considers the evaluation process at different levels of government
and across programs. There are at least three basic types of programs that
can be differentiated according to the allocation process and level of
government responsible for their selection and execution. These are: direct
grants awarded by the federal government to local and/or state education
agencies, and two types of state administered grants: basic grants to LEAs
(e.g., Title I) and special projects awarded by the state on a
competitive-bid basis (e.g., Title IVc).

Evaluation is required for all programs, but the relevance of various
aspects of the evaluation process differs. Evaluation within the context
of direct grants to LEAs (e.g., Bilingual Education) entails all of the
elements described above. Evaluation of state administered programs (e.g.,
Title I) entails more direction from the federal government regarding how
the evaluation is to be conducted, what is to be measured, how results are
to be reported, and when they are to be reported. In this sense, the
evaluator has less discretion over the evaluation process. Instead, the
evaluation becomes a matter of first fulfilling the mandated requirements
and then tailoring additional activities around these required activities,
especially when required information does not meet the needs of those
individuals at each level of government. In considering how evaluations are
conducted, it is necessary to consider these differences in program operation
and funding. In accounting for variations in the types and quality of
evaluation practices across agency levels and within the same level (across
programs), federal regulations play a central role. Further, there are
state-to-state differences in evaluation requirements due to state funded
educational programs. These differences influence the nature and scope of
evaluation practices at local agencies. Finally, within a given state,
differences in the quality of LEA practices depend upon local capabilities,
interest, and resources.



3.2 PROCEDURES FOR EVALUATION SPECIFIED BY LAW

Some sections of the law demand that evaluative questions be addressed,
but those statutes do not mention the word evaluation explicitly. Other
sections mention the term evaluation but offer little guidance as to what
is required. A minority of statutes are explicit as to what evaluative
procedures are required. The occasional statutory references to methods,
however, are interesting for their appearance at all, and suggest that it
is possible to specify method in more detail if Congress wishes to do so.

In a few instances, the statutes address particular aspects of the
mandated research. The broadest example of such an instance is the
requirement in the General Provisions that the Secretary, in an annual report
to Congress covering all federally funded educational programs, set forth
"goals and specific objectives in qualitative and quantitative terms" (e.g.,
General Provisions for Educational Programs: Federal Evaluations,
§1226c(a)(1)(A)).

The most common specification for individual programs is the
requirement that the evaluations employ "objective" measures or criteria
(e.g., Title I Programs, Local Evaluation, §2833; Federal Evaluation,
§2833(f); Adult Education, §1207a(a)(1)(B); Emergency School Aid,
§3200(a)(11); Dropout Prevention Programs, §3387(b)(3)). The mandate
for one program provides a list of the specific measures to be included
in the evaluation, including "qualitative assessments by teachers and
professors, cumulative grade point average, SAT scores, acceptance to
colleges and universities, college grade point average, and college major"
(Biomedical Sciences Program, §3054(a)(11)). Other mandates specify
certain characteristics of the measures, such as emphasizing that the data
should be comparable on a state-wide and nation-wide basis (e.g., Title I
Programs, Federal Evaluation, §2833(f)).

A few statutes place certain restrictions on the research design.
Two mandates for evaluation specifically require longitudinal studies
(Title I Programs, Federal Evaluations, §2833(f); Biomedical Sciences
Program, §3054(a)(11)). Other mandates offer suggestions concerning the
nature of the control groups. One suggests a no-treatment control group,
"composed of persons who have not participated in such programs or
projects..." (Emergency School Aid, §3200(a)(11)). Another is quite explicit
in requiring an evaluation "to compare the extent to which graduates and
dropouts of vocational education programs (1) find employment in occupations
related to their training, and (2) are considered by their employers to be
well trained and prepared for employment; except that in no case can pursuit
of additional education or training by program completers or leavers be
considered negatively in these evaluations..." (Vocational Education,
State Evaluations, §2312(b)). Another statute provided the option of
conducting "not more than three experimental studies" to achieve the
purposes of the evaluation mandate (Vocational Education, Federal Evaluation
by the National Institute of Education, §2563(b)(1)(D)).
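
One possible reading of the vocational education mandate quoted above can be sketched in Python. The record fields, the function name, and above all the choice to leave people pursuing further education out of the denominator are assumptions made for illustration; the statute says only that such pursuit may not be considered negatively, not how the rate is to be computed.

```python
from dataclasses import dataclass


@dataclass
class FollowUp:
    # Hypothetical follow-up record for one program completer or leaver.
    related_job: bool        # employed in an occupation related to training
    further_education: bool  # pursuing additional education or training


def related_placement_rate(records):
    """Share of former students employed in related occupations.

    Assumption: a person in further education who has no related job is
    dropped from the denominator, so that choice never lowers the rate.
    """
    counted = [r for r in records if r.related_job or not r.further_education]
    if not counted:
        return None
    return sum(r.related_job for r in counted) / len(counted)


sample = [
    FollowUp(related_job=True, further_education=False),
    FollowUp(related_job=False, further_education=True),   # excluded
    FollowUp(related_job=False, further_education=False),
    FollowUp(related_job=True, further_education=True),
]
rate = related_placement_rate(sample)
```

Other readings are possible, for example counting continued education as a positive outcome in its own right; the sketch only shows that the exception has to be settled before any rate is reported.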

Other methodological issues are addressed less frequently. Two
statutes comment on the need for "statistically valid sampling" or "random
sampling" in selecting the participants or programs to be included in the
evaluation (Vocational Education, State Evaluation, §2312(b); Career
Education Incentive Program, Federal Evaluation, §2613(c)). Another
addressed issues of generalizability by requiring that the evaluations
determine if the programs have "achieved goals and are capable of achieving
comparable levels of effectiveness at additional locations" (Adult Education,
§1207a(a)(1)(B)).

When warranted, explication in the law of the type of measures, design,
and other conditions clearly serves to focus the evaluation community
on the informational needs of the Congress. If it is the case, for instance,
that for a specific issue the lawmaker believes that the use of a
statistically valid sample of schools, districts, or students will provide
sufficient information for their purposes, it seems sensible that the use of
such a procedure should be written into the statute. Not only does the
evaluator receive useful guidance as to how to proceed, but uncertainty is
reduced as to the scope of work that is required, and the evaluator gains a
sense of what the audience of the report expects to receive.
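
To make the idea of a "statistically valid sample" of schools concrete, the following Python sketch draws a simple random sample and reports a mean with its standard error. The scores, the sample size, and the function name are hypothetical, and simple random sampling is only one of several designs a statute or regulation might have in mind.

```python
import random
import statistics


def sample_mean_with_se(values, n, seed=0):
    """Draw a simple random sample without replacement and estimate the mean."""
    rng = random.Random(seed)
    drawn = rng.sample(values, n)
    mean = statistics.mean(drawn)
    # Standard error of the mean with a finite population correction.
    fpc = (1 - n / len(values)) ** 0.5
    se = statistics.stdev(drawn) / n ** 0.5 * fpc
    return mean, se


# Hypothetical school-level average scores for 200 schools.
scores = [40 + (i % 25) for i in range(200)]
mean, se = sample_mean_with_se(scores, n=50)
```

The standard error shrinks as the sample grows and reaches zero when every school is included, which is the trade-off between cost and precision that a sampling provision leaves to the evaluator.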






This degree of explicitness is often not warranted for a variety of
reasons. In some cases, the feasibility of conducting a specific type of
evaluation is unknown and the details are deferred to the agency responsible
for carrying out the evaluation. In other instances (e.g., ESEA Title I)
the law requires the development and implementation of models for estimating
program effects, but the SEA and LEA personnel are provided with some
discretion as to which model they will follow.



3.3 ADMINISTRATIVE PROCEDURES: FEDERAL REGULATIONS

[...] at obtaining information from LEAs and SEAs. [...] evaluation is
primarily directed at national [...] conducted by independent contractors or
the federal agencies. [...] to be conducted is typically [...] by federal
regulations [...]

Federal Reporting Requirements 

Programs differ markedly with respect to the number and types of
evaluative mechanisms that are described within the law and by federal
regulations. To illustrate the variety of factors that affect the way
evaluations are conducted, we discuss four major educational programs.
These are: ESEA, Title I (Education of Disadvantaged Children: basic
grants to LEAs); ESEA, Title VII (Bilingual Education: basic grants and
demonstration grants); Special Education for the Handicapped (P.L. 94-142,
Part B); and Vocational Education (State grants and discretionary programs).
These programs were selected because they are diverse with respect to their
administration and organization.

This type of analysis is particularly important in that it lays the
foundation for assessing whether consistent procedures are employed and for
isolating where corrective action may be warranted, particularly if
structurally different programs with similar information requirements reveal
consistent weak areas (e.g., data quality).

(1) ESEA, Title I (Education of Disadvantaged Children: Basic Grants to
LEAs)

(a) LEA Evaluation Requirements. The 1974 and 1978 Education
Amendments require the Commissioner to develop and make available to SEAs
and LEAs (through the SEA) explicit standards and "models" for evaluation
reporting at the local level. The October 12, 1979 Federal Register
describes these standards and reporting regulations: every LEA receiving
funding is required to submit an evaluation plan to the SEA that addresses
how it will meet technical requirements of the regulation. At least once
every three years, the LEA must evaluate its programs using "reliable and
valid instruments," "procedures that minimize error," and "a design that
yields a valid assessment of achievement gains." This latter requirement
can be fulfilled by using one of three federally developed models or a
suitable alternative approved by the SEA and Commissioner. Each model is
supposed to provide an estimate of the effect of receiving Title I services
on student performance compared to an estimate of what performance would
have been in the absence of Title I services. Achievement scores are to
be reported to the SEA using a common measure, a "normal curve equivalent"
(NCE).
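
As a rough illustration of the common measure, the Python sketch below converts a percentile rank to a normal curve equivalent and takes the difference between observed and expected standing. The scaling (mean 50, standard deviation 21.06, chosen so that NCEs of 1, 50, and 99 coincide with the same percentile ranks) is the conventional NCE definition and is not stated in this report; the function names and the example percentiles are hypothetical.

```python
from statistics import NormalDist


def percentile_to_nce(percentile):
    """Convert a percentile rank (strictly between 0 and 100) to an NCE."""
    z = NormalDist().inv_cdf(percentile / 100)
    return 50 + 21.06 * z


def estimated_effect(observed_nce, expected_nce):
    """Observed post-test standing minus the no-treatment expectation."""
    return observed_nce - expected_nce


# Hypothetical group: expected to stay at the 25th percentile without
# services, observed at the 30th percentile after services.
effect = estimated_effect(percentile_to_nce(30), percentile_to_nce(25))
```

Each of the federally developed models differs in how the no-treatment expectation is obtained; the subtraction itself is the part they share.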



The new regulations also require longitudinal assessment to ascertain
whether Title I gains are sustained after services are withdrawn. This
assessment is for local use, and reporting is not required unless requested
by the SEA. Initial achievement status and gain, a description of the
assessment process, and project information are the only federally mandated
evaluation requirements that are imposed on LEAs. The project information
that is to be obtained includes: average duration of Title I service,
pupil-per-teacher ratios, expenditures per child, and number of participants.
According to the regulations, this project information is to be collected on
a sample of grade levels.

(b) The SEA Evaluation Requirements. The SEA is charged with the
responsibility for ensuring that the LEA educational plan is in compliance
with the law and, recently, this role has been expanded to include more
extensive evaluation functions. SEAs are responsible for monitoring how
the projects are carried out, providing technical assistance regarding LEA
evaluation, and aggregation of LEA data. The monitoring function is carried
out through field visits by state Title I representative(s). The state
receives one and one-half percent (set-aside) of its total allocation, or
$150,000, whichever is greater, to perform these functions.
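
The set-aside rule amounts to taking the larger of a percentage and a dollar floor. A minimal Python sketch follows; the function name and the example allocations are hypothetical, and only the 1.5 percent rate and the $150,000 floor come from the text.

```python
def title_i_set_aside(total_allocation, rate=0.015, floor=150_000):
    """Return the larger of the percentage set-aside and the dollar floor."""
    return max(total_allocation * rate, floor)


# Hypothetical allocations.
small_state = title_i_set_aside(4_000_000)    # 1.5% is 60,000, so the floor applies
large_state = title_i_set_aside(40_000_000)   # 1.5% is 600,000
```

The floor matters only for states whose allocation is under $10 million, since 1.5 percent of that amount equals $150,000.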

The SEA compiles the data that are submitted by the LEAs and submits
(1) an annual performance report, containing the number of participants
served by type of service, the number of participants by grade level for
public and nonpublic recipients, and "other information requested by the
Commissioner"; and (2) a biennial evaluation report, summarizing information
for all or a representative sample of LEAs.

(c) Federal Evaluation Requirements. Section 183 of the 1978
Education Amendments clearly delineates the evaluation tasks and priorities
to be addressed by the Commissioner. The law makes provision for two levels
of evaluative evidence: independent evaluations designed to "describe and
measure the impact of programs" and the provision of technical assistance
to state and local agencies on conducting evaluations. A maximum of
one-half of 1 percent of the amount appropriated for these programs is
provided for evaluation, and priority is to be given to federal assistance
to state and local agencies.

(2) ESEA, Title VII (Bilingual Education) Evaluation Requirements 

(a) LEA Requirements (Basic Grants to LEAs). Unlike Title I, these
programs are direct grants awarded to LEAs or Institutions of Higher
Education which apply jointly with an LEA. The guidelines for the evaluation
plan appear in the rules and regulations (Federal Register, Vol. 45, (67),
April 4, 1980).

As part of the application process, the grantee is requested to
specify performance objectives and an evaluation plan. The proposal
review procedure specifies that each proposal is rated on a scale of
110 possible points. The specification of objectives and the evaluation
plan are each allocated 15 points. The regulations specify that the
evaluation plan is reviewed for evidence that:

(1) The overall evaluation plan is consistent with the instructional
training objectives;

(2) Adequate attention is paid to (a) the assessment of all
objectives, (b) data collection instruments, (c) analysis
procedures, (d) time schedules, (e) staff responsibilities;

(3) The design specifies a comparison procedure to estimate
what performance would have been in the absence of the
project;

(4) Methods to be used to identify nonparticipants for
comparison, or another comparison standard (e.g., an
historical or statistical comparison), have been described;

(5) Sampling procedures have been identified to ensure that
the sample is representative of the project population;

(6) Data collection and analysis procedures will address the
evaluation questions and "are appropriate for use with
the project data"; and

(7) "The data obtained will contribute to improvement in the
operation of the project."

(b) SEA Requirements. The law allows SEAs to apply for "technical
assistance" contracts if, during the preceding year, an LEA within the state
had received funds. These contracts may not exceed five percent of the
total LEA awards. Activities associated with these contracts may take the
following forms: (1) monitoring LEA bilingual programs, (2) evaluating
the impact of programs, (3) facilitating the exchange of information, and
(4) dissemination of materials acquired by the SEA to the LEAs. While
the regulations specify that the application will be reviewed according
to a point system and that the evaluation plan is to be specified, little
guidance is provided as to what aspects should be considered. The review
criterion for the evaluation plan consists of a statement which simply
mentions attributes such as "quality of the evaluation plan,"
"appropriateness of the methods," and "to the extent possible methods should
be objective and produce data that are quantifiable." Except for basic grants
and demonstration projects, the same "boiler plate" statement appears in the
description of the evaluation plan for all of the other program categories
funded under this title. Such statements regarding evaluation components
are of little use in guiding the development of an evaluation plan.

(c) Federal Evaluation Requirements. The law specifies the contents
and schedule of the report to Congress. Beginning in 1980, and every two
years after that, a report is to be submitted. The contents must include:
a national assessment of the educational needs, the extent to which these
needs are being met, a five-year plan (its costs and the needs for educational
staff), and a report on and an evaluation of the activities carried out under
the title.

(3) Public Law 94-142 (Education for the Handicapped) Evaluation
Requirements

The Education for All Handicapped Children Act of 1975 and the
pertinent regulations are explicit about responsibilities. The states are
the primary target of federal oversight, and they in turn are responsible
for overseeing the local education agencies. The program is focused on
the provision of a "free, appropriate public education for all handicapped
children." The Bureau of Education for the Handicapped (BEH) in USOE was
assigned the responsibility for administration and evaluation of P.L. 94-142.

(a) LEA "Evaluation" Requirements. At the local level, the term
evaluation refers primarily to diagnostic assessment of children. The
regulations require that preplacement evaluation be conducted using
multiple, appropriate assessment modes. If the child is found to have
a handicapping condition, an Individualized Educational Plan (IEP) is
devised. The content of the Individualized Educational Plan is required by
the regulations to include: (1) an assessment of present levels of
educational performance; (2) a statement of annual goals and short-term
instructional objectives; (3) a statement of specific special education and
related services and an assessment of the extent to which the child is able
to participate in regular education programs; (4) projected dates for
initiation and termination of services; (5) appropriate objective criteria,
evaluation procedures, and a schedule for reevaluation.

(b) SEA Evaluation Requirements. The state has responsibility to
ensure that the IEP has been prepared and that it meets the educational
standards of the state. This is essentially a monitoring function and is
carried out through on-site visits. Elaborate checklists have been developed
by state agencies and BEH for assessing compliance with regulations.
Additional monitoring requirements include fiscal audits and an assessment of
the extent to which the Individualized Educational Plan is actually carried
out in practice. This latter function is essentially a check to ensure
that the program for individual children is actually implemented.

The law specifies that in any fiscal year, the state may use five
percent of the total state allotment under Part B, or $200,000, whichever
is greater, for conducting required administrative activities. Evaluation
in the sense of monitoring is included under this category of activities.

The State Education Agency is required to report (1) the number of
handicapped children receiving services on October 1 and February 1 of the
school year; (2) the number of handicapped children within each disability
category; (3) the number of handicapped children within each of three age
groups. For all figures, unduplicated counts are required. This report is to
be transmitted to the Commissioner.
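
The requirement for unduplicated counts means that a child who appears in more than one record is counted once. The Python sketch below shows one way to do this; the record layout and the rule of keeping the first category seen for a child are assumptions for illustration, not reporting rules taken from the regulations.

```python
from collections import Counter


def unduplicated_counts(records):
    """Count each child once, overall and by disability category.

    `records` is an iterable of (child_id, category) pairs. The first
    category seen for a child is kept (an assumption for illustration).
    """
    first_category = {}
    for child_id, category in records:
        first_category.setdefault(child_id, category)
    return len(first_category), Counter(first_category.values())


# Hypothetical service records; child "c1" appears twice.
rows = [("c1", "speech"), ("c2", "hearing"), ("c1", "speech"), ("c3", "speech")]
total, by_category = unduplicated_counts(rows)
```

A state would also need a rule for children served under more than one category, which this sketch settles arbitrarily by order of appearance.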

(c) Evaluation Requirements at the Federal Level. The Commissioner
has responsibility for evaluation under Section 618 of the Act.
Specifically, the legislation authorizes (1) annual studies; (2) assessment of
the adequacy of information provided by state agencies; and (3) development
of effective methods and procedures for evaluation.

(4) Evaluation Requirements for Vocational Education

Funding for Federal Vocational Education programs is of two basic
types: formula grants to states and discretionary grants. The evaluation
process is different for each type. Here we consider only the evaluation
requirements for the formula grants administered by the states.

State administered Vocational Education programs require evaluation
at the state and federal levels. At the state level, formal evaluation is
routinely conducted by two groups: the State Department of Vocational
Education and the State Advisory Council on Vocational Education (SACVE).
At the federal level, there is a parallel organizational scheme. The Bureau
of Occupational and Adult Education (BOAE) within USOE and the National
Advisory Council on Vocational Education (NACVE) serve as the federal level
counterparts to the state agencies. The local administration of these
programs is carried out by the district. The evaluation is typically
informal, being composed of needs assessment and guidance regarding program
operation provided by the Local Advisory Council on Vocational Education
(LACVE).

(a) Evaluation Requirements at the State Level. The law and
regulations are explicit as to the content and procedures to be employed in
the state evaluation. The evaluation is structured around a five-year
program plan. The legislation explicitly states that the purpose of the
evaluation is to revise and improve the programs conducted under this plan.
This plan is jointly devised by representatives of the State Department
of Education and the State Advisory Council (SACVE).

State Department of Education requirements. During the five-year
period of the state plan, the State Department of Education is to evaluate
the effectiveness of each program in terms of (a) planning and operational
processes, (b) student achievement, (c) student employment success, and
(d) issues related to special populations. Further, the state is required
to evaluate the extent to which individuals who complete or leave the program
obtain employment in occupations related to their training and whether their
employers consider them well trained and prepared for employment. Sampling
is permitted for this assessment. Finally, the State Department of Education
is required to submit an annual accountability report which includes a
description of how funds were used, a summary of the evaluations that were
conducted, and a description of how the evaluation information has been used
to improve the state's program.

State Advisory Council requirements. Annually, the State Advisory
Council is to prepare and submit to the Commissioner and the National
Advisory Council on Vocational Education an evaluation report. Its contents
are to include a synthesis of its evaluation of State Department
administration and operation and the evaluations performed by the State
Department of Education.

(b) Evaluation Requirements at the Federal Level. An organizational
structure, parallel to the state level, is established within the law for the
federal level agencies. There are some notable differences in the
explicitness of the evaluation requirements prescribed for the National
Advisory Council, however.

Evaluation requirements for the Bureau of Occupational and Adult
Education. At least ten states are to be reviewed during a given fiscal
year. The purpose of the review is to analyze the strengths and weaknesses
of state programs. At the same time, HEW is to conduct fiscal audits
within those states. The Commissioner is to transmit to Congress a report
on the national status of the Vocational Education programs. The report is
to include information developed from the National Vocational Education
Data System (VEDS), a summary of information obtained from federal reviews
and audits, and a synthesis of the evaluations performed by State Departments
and State Advisory Councils.

Evaluation requirements for the National Advisory Council on Vocational
Education. NACVE received a broadly stated evaluation function in the
legislation. Its primary function is to provide policy-oriented annual
reports and assessment of USOE-BOAE administration and operations.

Diversity in the Type of Evaluation Regulations 

Examining the amount and type of information that is required across
the four programs, it is apparent that there are substantial differences.
Programs of the direct grant type (e.g., Bilingual and the Discretionary
grants for Vocational Education) have the fewest oversight and
reporting requirements. Title I and Vocational Education (Basic grants)
are both state administered, formula allocation grants and have an additional
level of evaluation imposed by the state agency. Vocational Education can
be distinguished from Title I in that two agencies at the state level and two
agencies at the federal level are responsible for conducting routine
evaluations. From this comparative assessment, we see that not only do the
law and regulations indicate how evaluation is to be carried out, they can
also influence how much is conducted and by whom.



3.4 HOW EVALUATIONS ARE CONDUCTED AT STATE AND LOCAL LEVELS 

Although the regulations and legislation are directive as to evaluation 
requirements under each funding source, the way programs are evaluated at 
state and local levels varies dramatically, making overall statistical 
characterization difficult. As a consequence, types of operation are 
described and illustrated.

Experience derived from field interviews at six State Education Agencies
and one Local Education Agency within each of these states suggests that LEA
practices are influenced by the educational and evaluation practices of the SEA.
Attention, then, is directed first at the SEA level of evaluation.

STATE LEVEL 

Until recently, the SEA was primarily an administrator of federally
funded educational programs that operated in the LEAs. Mandated activities
included fiscal auditing, compliance auditing and the aggregation of data
reported from LEAs. The SEAs, in turn, reported to the Commissioner. With
increased Congressional interest in local level evaluation, the SEAs have
been given additional evaluation authority. Recent Title I legislation
requires the SEA to provide technical assistance to the LEAs for the
purpose of improving local evaluation efforts. Program planning and
evaluation, identification of exemplary programs, and the dissemination
of exemplary practices are some additional activities that SEAs have
come to perform.

As might be expected, State Agencies vary with respect to their level
of involvement in the evaluation of federal programs. Part of this
state-to-state variation is due to the educational organization within a state
and some of the variation is due to the State's own investment in
educational programs. We found some states with a rather substantial monetary
investment in programs that are similar to federal programs, notably
State Compensatory Education and Bilingual Education. For these cases,
State legislative interest in evaluation and accountability is a driving
force behind the development and maintenance of a strong state evaluation
component. As a consequence, state level evaluation capabilities have
been strengthened, federal programs being a primary beneficiary of
increased expertise. The impact of State interest in evaluation on the LEA's
practices can be remarkable, resulting in many LEAs going substantially
beyond federal evaluation requirements.



Types of State Education Agencies and Evaluation Practices

For simplicity, three types of SEAs are identified as a way of
categorizing State level evaluation of federal programs. We offer these
characterizations only as a first attempt to describe their role in the
evaluation process. Little is known about the activities of SEAs with
respect to the evaluation of federal programs, except of course what is
required of them by federal regulations. These types are labeled
Exemplary SEAs, Compliance-Oriented SEAs and those falling between the
two, referred to as Emergent SEAs. This rough classification scheme was
derived as a result of our field visits, conversations with federal staff
and our review of reports obtained through telephone solicitations. No
statistical characterization of the prevalence of each type can be offered.

Since these categories provide a basis for distinguishing among SEAs,
brief summaries of their distinctive features are provided. These
descriptions are not intended to be complete portrayals of each type; they
are merely thumbnail sketches which summarize, roughly, the discrepancies
found among SEAs with respect to the way federal programs are assessed.

(a) Exemplary SEAs. These are characterized by an active interest
in the evaluation process at numerous levels of state organization. These
levels include the State legislature, the program sector and the
evaluation component. Common elements among states of this type which seem to
be responsible for this interest include: administrative or legislative
fiscal support for the state level evaluation component, few administrative
levels between the evaluation component and the Chief State School Officer,
substantial monetary investment in State funded educational programs and
strong public support for educational accountability. From an evaluation
perspective, perhaps the most salient characteristic of this type of SEA
is the presence of a well articulated evaluation plan which encompasses
state and federal programs within the program planning, implementation,
evaluation and dissemination processes. The impact of this overall plan
is seen throughout the SEA and across program areas, and it permeates the LEA
level, influencing evaluation practices and program administration.

(b) Compliance-oriented SEAs. The absence of one or all of the
characteristics identified for Exemplary SEAs is likely to influence staffing
levels, monetary support and/or technical capacity (e.g., computer
facilities) in ways that make it simply impossible to go beyond the minimal
requirements established by federal regulations. As a consequence, efforts are, by
necessity, directed at ensuring that the minimum standards are adhered to
and there is little opportunity to do anything else. For the purpose of
federal reporting requirements, a compliance-only mode of operation should
not be viewed negatively so long as the quality of what is reported is high
or at least the level of quality is known. In these cases, the role of
the Technical Assistance Centers is likely to be extremely important.

(c) Emergent SEAs. Recent developments in several arenas have
contributed to improved evaluation practices at State Departments of Education.
These factors include: federal development of specific guidelines pertaining
to evaluation requirements, Technical Assistance in the implementation of
these guidelines, increased availability of trained evaluation personnel,
and direct federal assistance designed to strengthen SEA capabilities. SEAs
classified as "emergent" are those which are no longer simply complying
with regulations. Instead, agency-initiated practices are developing with
respect to how evaluations are to be carried out for their own needs.

Emergent SEAs can be distinguished from the exemplary category on a
number of dimensions. A primary distinction is the extent to which the
exemplary practices are exhibited across a variety of programs. On this
dimension, an answer to the question of how, or how well, evaluations are
conducted would have to be prefaced by the statement, "it depends on
which program you are referring to." An additional distinction that might
be drawn between these SEAs and the exemplary variety pertains to the
institutionalization of the enterprise: evaluation in the emergent states
is only now beginning to gain credibility.

Illustrations of State Level Evaluations 

Taking into consideration the discussion of elements comprising the
evaluation process described in Section 3.1 and federal requirements
described in Section 3.3, the most salient issues pertaining to how
evaluations are conducted at the state level can be summarized as follows:

1. Program planning and approval;

2. State on-site monitoring: compliance and program reviews;

3. Specification of reporting requirements;

4. State level analysis, aggregation and data quality control.

The extent to which each of these elements is addressed within state
level evaluations is the focus of this discussion. Illustrations which
highlight the distinctions among the three types of SEA will be made wherever
possible.

(1) Program planning and program approval. The legislation and
regulations assign responsibility to the state for approval of LEA
applications for program funding and the evaluation plan. This function is
particularly relevant for state administered programs. Two basic strategies
were observed in the SEAs we visited: (a) "on-paper" review to ensure that
the program as planned is consistent with federal regulations and (b)
compliance review plus an "educational-quality" review. Examination of local
and state reports suggests that this latter form of planning is more typical
of exemplary SEAs. However, there are notable legislative and regulatory
exceptions across programs that attempt to induce better planning. The
legislation pertaining to Vocational Education, for example, requires the
state agency to explicitly indicate how the previous year's evaluations
have been used to improve program operation. A similar requirement within
Title I applications was observed in a few sites. California employs the
notion of a Master plan in order to achieve continual improvement of
educational practices. In support of this effort the SEA disseminates
written material on program improvement and conducts site visits. In
California, this aspect of the educational planning process is part of the
development of a Consolidated Application plan which allows LEAs to devise
programs relevant to individual LEA needs. Title I and four state funded
programs are involved in California's Consolidated Application process.
Our interviews with State Department personnel suggest that this coordinated
effort results in better targeting of funds for specific student needs.

The more common planning strategy appears to be a review process where
programs are first examined "on-paper" to ensure they meet federal standards.
At this point, the state either approves the program or recommends
modifications. In well established programs, such as Title I, the submission,
approval and funding process can become routinized, resulting in little
substantive change over time. Consequently, the same goals, level of
attainment and the like could be specified from year to year.

Regulations and legislation provide for a certain amount of flexibility
as to how programs can be operated. This discretion translates into numerous
choice points regarding the operation and instructional characteristics of
local programs. Our experience suggests that in many cases these choices
are made in the absence of systematic tests as to the impact of one alternative
or another. We found little attention being directed at the state level
towards systematic tests of alternative means of delivering program services.
Such alterations could be easily designed into the program planning process
and systematically examined. The feasibility of introducing systematic tests
of program alternatives for ongoing programs into the planning process can be
illustrated with a couple of examples.

Example A: Testing program components. Title I has regulations
specifying that funds can only be used for the purpose of supplementing, not
supplanting, regular education efforts. In the plan submitted by LEAs, the
manner in which this regulation is to be satisfied is specified. This
typically involves the selection of a particular educational method (e.g.,
pull-out). Given the state's oversight responsibility for approving such
organizational/operational schemes, there appears to be a considerable
amount of room for SEAs to encourage LEAs to systematically examine the
impact of differing project level operating schemes. In our site visits,
we did not encounter the use of this type of planning/evaluation strategy.

Example B: Program variations. Length of exposure and mode of
instruction, independent of program settings, represent additional aspects
of program operation which could be examined rigorously by SEAs provided
adequate attention is paid to methodological issues during the planning
phase. Through coordination with the State Agency, different "dosages" of
exposure time could be allocated to a representative sample of capable LEAs
in order to assess the impact of exposure on achievement.

By virtue of the state's oversight responsibility for approval of the
educational and evaluation plan to be carried out by the local agency,
urging the SEA to select those LEAs that have adequate resources to
participate in these types of systematic tests of alternatives seems sensible.
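
As a rough illustration of Example B, the sketch below shows one way a State Agency might allocate exposure levels to participating LEAs at random, so that differences in achievement can later be attributed to exposure rather than to how districts were chosen. The LEA labels, the three weekly exposure levels and the function name are hypothetical; nothing of this kind was observed in the sites visited.

```python
import random

def assign_dosages(leas, dosages, seed=0):
    """Randomly assign each LEA to one exposure level, keeping group sizes balanced."""
    rng = random.Random(seed)      # fixed seed so the allocation can be reproduced
    shuffled = list(leas)
    rng.shuffle(shuffled)
    # Deal the shuffled LEAs round-robin so group sizes differ by at most one.
    return {lea: dosages[i % len(dosages)] for i, lea in enumerate(shuffled)}

# Hypothetical LEAs and hypothetical minutes per week of supplementary instruction.
leas = [f"LEA-{n:02d}" for n in range(1, 13)]
plan = assign_dosages(leas, dosages=[90, 150, 210])
for lea in sorted(plan):
    print(lea, plan[lea])
```

Balanced random assignment is only one possible design; a state with few capable LEAs might instead match districts on enrollment before assigning exposure levels.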

(2) On-site monitoring: compliance and program quality review. In the
evaluation literature, it is well known that what is planned is not always
the same as what is ultimately implemented in the operational field setting.
The administrative analogue of this principle is compliance monitoring. The
regulations specify guidelines pertaining to compliance and SEAs are
responsible for ensuring that the LEA adheres to them in practice.

States differ with respect to the amount of on-site monitoring that
is conducted. In addition to the fact that states have varying degrees of
monetary investment in programs similar to the federal programs, the
federal allocation for administration, of which monitoring is a component,
is poorly structured. A set-aside of the total state allocation with a
fixed ceiling has been designated within the law. The amount of the
set-aside is sufficient in some cases but strongly favors those states with
few LEAs. States with a large number of districts, many with 300-500 districts
receiving some federal funding, find it difficult to monitor the troublesome
sites, let alone all the sites. Of course, representative sampling of sites
is sufficient if the purpose of monitoring is to obtain a statistical
characterization of compliance, but if the issue is to obtain an outside check
on the extent to which programs are implemented as planned, "spot-checking"
would be insufficient.
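
The distinction drawn above between a statistical characterization and a site-by-site check can be made concrete with a small sketch. It estimates a statewide compliance rate from a simple random sample of site visits, using a normal approximation with a finite population correction. The district count, sample size and review outcomes are hypothetical, and the function name is ours.

```python
import math
import random

def compliance_estimate(results, population_size, z=1.96):
    """Estimate the share of compliant sites from a simple random sample.

    results: list of booleans (True = site found compliant).
    Returns (proportion, lower, upper) from a normal approximation
    with a finite population correction.
    """
    n = len(results)
    p = sum(results) / n
    fpc = (population_size - n) / (population_size - 1)
    se = math.sqrt(p * (1 - p) / n * fpc)
    return p, max(0.0, p - z * se), min(1.0, p + z * se)

# Hypothetical state with 400 funded districts; 60 are drawn at random for visits.
rng = random.Random(1)
visited = rng.sample(range(1, 401), 60)

# Hypothetical review outcomes: 48 of the 60 visited sites are found compliant.
outcomes = [True] * 48 + [False] * 12
p, lo, hi = compliance_estimate(outcomes, population_size=400)
print(f"estimated compliance rate {p:.2f}, approximate 95% interval {lo:.2f} to {hi:.2f}")
```

The interval describes the state as a whole. It says nothing about whether any particular unvisited district is implementing its program as planned, which is why sampling cannot substitute for an outside check on individual sites.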

An exemplary instance of a program-quality-review procedure is offered by
California. In addition to examining program compliance, quality of the
school program is assessed through site visits using the Program Quality
Review Instrument (PQRI). As part of this review process, the California
State Department of Education recently conducted an assessment of this review
process by soliciting opinions of those individuals who participated.
Similar follow-up procedures were instituted by the Vocational Education
evaluation personnel; this assessment was designed to ascertain the
extent to which recommendations prescribed by the site review team were
adopted by the local agency.

Well defined compliance monitoring systems were observed for numerous
programs across the SEAs that were visited. In many instances, state procedures
are more extensive in scope than federal regulations prescribe. For example,
compliance monitoring for P.L. 94-142 is coupled with more stringent compliance
regulations of State programs in New Jersey and in Massachusetts, though
each employs somewhat different procedures tailored to state legislative
requirements.

The importance of compliance monitoring, especially coupled with
program quality review, cannot be overstated. Current administrative
procedures are such that, more often than not, the evaluation of local
level programs is insufficiently funded to allow outside consultants to
perform these audits. However, given the fixed set-aside allocation for
program administration under certain titles (e.g., Title I, Bilingual),
similar staffing problems are likely. This appears to be an area where
more attention is needed.

An admirable use of Bilingual Title VII funds at the State level for
improving the monitoring and ultimately the reporting practices of LEA
grantees was observed in Massachusetts. A series of studies was commissioned
by the State Bilingual Office to examine state Title VII programs and their
evaluations. One result of these investigations was the development of
contract specifications, reporting guidelines and standards for Bilingual
evaluations. Further, the plan developed by the outside contractor was
scheduled to be pilot-tested prior to full-scale implementation. From a
planning perspective, these practices are admirable and should be promoted.
Similar, thoughtful planning and monitoring was observed in other program
areas in the Massachusetts SEA.

(3) State specification of reporting requirements. Judging from
recent legislation, Congress has a considerable interest in obtaining
program information and outcome evidence on a nationwide basis. This is
not a new concern. The distinctiveness of the recent legislation is that
it represents a more direct request and, in many cases, substantial
monetary backing is provided (e.g., Title I). From the Federal perspective,
a major concern is that data collection and aggregation procedures are
comparable across states, since the primary federal goal is national-level
aggregation. The SEA is ultimately responsible for ensuring that federal
mandates are fulfilled. These data collection and aggregation procedures
are common tasks ascribed across program areas for state administered
programs. The ultimate utility of this evidence at the Federal level is
dependent upon state-to-state consistency regarding what is collected,
in what format and by whom.

States and programs within states vary considerably as to how
reporting requirements are fulfilled. Most states rely on a two-phase aggregation
process. Typically, the LEA collects project level information and performs
the first level of aggregation. The State then receives the LEA reports and
performs the second aggregation.

The possibility for error is compounded at each level of aggregation
and it becomes more difficult to account for the distinctive features of a
particular project. The necessity for coordination within and across
states is evident from the recent flurry of activity associated with the
development of information reporting systems in Compensatory Education (TIERS),
Vocational Education (VEDS), and Education for the Handicapped through BEH.
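
The two-phase aggregation described above can be sketched in a few lines. The example assumes each project reports a count of students tested and a mean gain; the record layout, the figures and the function name are hypothetical. It also shows one way an error can enter at the second phase: averaging LEA means without weighting by the number of students tested.

```python
def combine(records):
    """Pool records that each carry a student count and a mean gain."""
    n = sum(r["n_tested"] for r in records)
    total_gain = sum(r["n_tested"] * r["mean_gain"] for r in records)
    return {"n_tested": n, "mean_gain": total_gain / n}

# Hypothetical project-level records (gains in NCE units) for two LEAs.
lea_a_projects = [{"n_tested": 200, "mean_gain": 2.0},
                  {"n_tested": 50, "mean_gain": 8.0}]
lea_b_projects = [{"n_tested": 30, "mean_gain": 10.0}]

# Phase 1: each LEA aggregates its own projects.
lea_reports = [combine(lea_a_projects), combine(lea_b_projects)]

# Phase 2: the state aggregates the LEA reports.
weighted = combine(lea_reports)["mean_gain"]
unweighted = sum(r["mean_gain"] for r in lea_reports) / len(lea_reports)

print(f"weighted state mean gain:   {weighted:.2f}")
print(f"unweighted state mean gain: {unweighted:.2f}")
```

Once the LEA reports are pooled, the state figure no longer shows that one small project accounts for the largest gain, which is the loss of project-level detail the text refers to.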

In an effort to understand the issues related to obtaining high quality
data from LEAs, current data collection procedures for Title I reporting
were examined. Specifically, the contents of Title I Annual Evaluation
Reports for 1978-1979 submitted to 10 different states were examined. Table
1 provides a summary of the information that is required of LEAs under the
auspices of these "Evaluation" reports. In Table 1, an "X" designates that
this information is required, a "0" indicates it is not even mentioned and
a "?" signifies that it is impossible from visual inspection to ascertain
whether the item is required or at the discretion of the LEA. In specifying
the possible content that might appear in these reports, Title I requirements
and elements that were characterized earlier as part of the evaluation process
were used as the basis for comparing state evaluation reporting schemes. As
such, the comparison across states is devised to include information on
characteristics of the program, parent advisory council, staff, testing and
other desirable evaluation components. Examination of the pattern of entries
in Table 1 across states for each information element reveals a substantial
amount of overlap in what is reported. For example, the testing cycle (spring
to fall, spring to spring, or fall to fall) is clearly identified in all 10
State forms. Also, the reviewer can easily discern whether scores are reported
for only students with both before and after scores, the grade level is always
specified and the program area (Mathematics, Reading, etc.) is easily
detectable. Further, Title I staff characteristics and Parent Advisory Council
characteristics are almost uniformly reported.

Areas where there is less consistency across states include test
identification, explanation for the discrepancy between the number of students served
and the number tested, and method of scoring the test. Each of these factors
is important in the aggregation process in order to understand the quality of
the data. Of substantial policy relevance is the identification of program
characteristics. It is well known that programs and their modes of operation
differ. We see that program setting is often omitted in these reports. Some
program settings involve "pulling" Title I students out of their regular
classroom; others are self-contained classrooms. Student exposure to the program is
another variable of interest, and as late as the 1978-1979 school year it was not
uniformly reported. This variable can be used to illustrate yet another level of
complexity that must be considered when data are reported from the LEA. Although
the exposure level was reported by many states, closer inspection of the way it was
reported reveals considerable variation. For example, it appears that some LEAs
reported the amount of exposure that they had intended to provide each student;
others seem to have provided estimates of the actual amount of time the
children received services. If the issue is to identify those programs
that are exemplary, consistency of reporting seems essential. This level
of detail is important and a lack of specificity is certainly a source of
confusion for those who must comply with reporting regulations. The new
Title I reporting system (TIERS) is designed to provide explicit indicators
of program characteristics which will improve the degree of consistency
within and across states; its quality remains to be seen, however.

Table 1

Evaluation information required in LEA 1978-1979 Annual Evaluation
Report for Title I Programs (Regular) for Ten States

                                               State
Information element               1   2   3   4   5   6   7   8   9   10

Composition of committee          X   X   0   X   X   ?   X   X   X   X
Activities of committee           X   X   0   X   X   ?   X   X   X   X
Title I staff                     X   X   0   X   X   X   X   X   X   X
Staff inservice                   X   X   0   X   X   0   X   X   X   X

Test Information
Testing cycle                     X   X   X   X   X   X   X   X   X   X
Complete test identification      X   X   X       X   X   ?   ?   0   0
Method of scoring                 0   X   0   0   0   0   0   0   0   0
Matched on Pre/Post scores        X   X   X   X   X   X   X   ?   X   X
Explanation for missing scores    X   0   0   X       ?   0   ?   0   0

Test Results
Reported "on level"               ?   X   X   0       X   ?   ?   ?   ?
Reported by grade level           X   X   X   X   X   X   X   X   X   X
Reported by school, project       0   0   0   0   0   0   0   0   0   0
Reported in NCE's                 X   X   0   X   0   0   X   X   X   X

Additional Elements
Narrative with interpretations    0   0   0   X   ?   X   ?   0   0   0
Recommendations                   0   0   0   X       X   X   0   0   0
Assessment of other objectives    0   0   0   ?   0   X   X   0   ?   0
Dissemination/Feedback            0   X   0   0   X   ?   X   X   0   0

The last few rows of Table 1 show that there is considerable diversity
in these reports, across states, with regard to the inclusion of narrative
summaries, recommendations, assessment of objectives other than achievement
and dissemination/feedback. These characteristics were isolated earlier as
important aspects of the evaluation process; however, they are given little
attention in state reporting requirements. As a consequence, the "evaluation
reports" are in many cases merely summaries of head counts, aggregate test
scores and little more.
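
The kind of row-by-row comparison made from Table 1 can be reproduced mechanically. The sketch below tallies, for a few rows transcribed from Table 1, how many of the ten state forms require an element, do not mention it, or leave its status unclear. The row strings are our transcription (one character per state) and the function name is hypothetical.

```python
from collections import Counter

# Entries transcribed from Table 1, one character per state:
# "X" = required, "0" = not mentioned, "?" = cannot be determined.
table1 = {
    "Testing cycle": "XXXXXXXXXX",
    "Method of scoring": "0X00000000",
    "Reported by grade level": "XXXXXXXXXX",
    "Narrative with interpretations": "000X?X?000",
    "Dissemination/Feedback": "0X00X?XX00",
}

def tally(entries):
    """Count how many state forms fall under each code for one element."""
    counts = Counter(entries)
    return {"required": counts["X"],
            "not mentioned": counts["0"],
            "unclear": counts["?"]}

for element, entries in table1.items():
    print(f"{element:32s} {tally(entries)}")
```

A tally of this sort makes the contrast plain: elements tied to test administration are required almost everywhere, while interpretive elements are required by only a handful of states.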

(4) State level aggregation and quality control. The aggregation
process can take basically two forms: (a) translation of what is received
into common units and summation or (b) in addition to these tasks, further
analyses/evaluation may be undertaken. The Michigan, California, and New
Jersey State Departments of Education represent instances of the latter
category. Michigan used data from its 1975-1976 Title I evaluation report
to examine program effectiveness by examining building-level programs
within types of school district. California's assessment of Title I (and
State funded programs) included the use of multiple achievement tests,
program quality reviews obtained through site visits, and quality assessments
of the data reported from LEAs. Similarly, New Jersey regularly assesses
and reports its evaluation of Title I and the State Compensatory Education
Programs simultaneously.

More typical of the state reports that were reviewed is aggregate
reporting with varying degrees of attention to data quality. Notable
exceptions are those State Departments that explicitly list the amount of data
that has been excluded, the reasons for exclusion and the bias that is
likely to result from its exclusion from the aggregate analysis. Further,
this type of careful data management has been found to be useful in
identifying LEAs which should be targeted as candidates for technical assistance.
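
A minimal sketch of the kind of exclusion accounting described above follows. It assumes student records with a pretest score and a posttest score that may be missing; the record layout, the scores and the function name are hypothetical. The comparison of pretest means is only a crude indicator of possible bias, not a correction for it.

```python
from statistics import mean

def exclusion_report(records):
    """Summarize how many students are dropped for lack of a posttest
    and how their pretest scores compare with those who are retained."""
    kept = [r for r in records if r["post"] is not None]
    dropped = [r for r in records if r["post"] is None]
    report = {
        "n_served": len(records),
        "n_analyzed": len(kept),
        "pct_excluded": 100 * len(dropped) / len(records),
        "pretest_mean_kept": mean(r["pre"] for r in kept),
    }
    # A large gap between the two pretest means hints at bias from exclusion.
    if dropped:
        report["pretest_mean_dropped"] = mean(r["pre"] for r in dropped)
    return report

# Hypothetical student records (NCE scores); None marks a missing posttest.
students = [
    {"pre": 30, "post": 36}, {"pre": 42, "post": 45}, {"pre": 38, "post": 40},
    {"pre": 22, "post": None}, {"pre": 26, "post": None},
]
print(exclusion_report(students))
```

In this invented case the excluded students started well below those retained, so an aggregate computed only from matched scores would describe a higher-scoring group than the one actually served.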

Conducting evaluation of federal programs through the aggregation
of data elements from diverse sources requires that careful attention be
directed at the quality of the collection process. This is an expensive
proposition. It is not sufficient to simply provide data forms that give
the appearance of generating compatible data. Close monitoring of the
process is mandatory. The Census Department regularly conducts validation
audits. This practice is similar, in principle, to the required audit of
Individualized Education Program plans (IEPs) undertaken by BEH.
Consideration should be given to the idea of routinely assessing the validity of
information that is reported to the SEAs. BEH, in preparation for the
implementation of reporting requirements associated with Public Law 94-142,
commissioned an assessment of state counts of handicapped children and
produced a handbook for conducting future validation studies. Systematic
sampling of LEAs seems to be an economical and sufficient means of
assessing the quality of data elements included in reporting systems.
However, given the diversity in types of data collected under various
titles, it is likely that audit procedures tailored to the specific
data system will be necessary.
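
One way to draw a systematic sample of LEAs for a validation audit is sketched below: every k-th district is taken from an ordered roster after a random start. The roster, its ordering by enrollment, the sample size and the function name are all hypothetical, and a real audit would need a design tailored to the data system in question.

```python
import random

def systematic_sample(units, sample_size, seed=0):
    """Pick evenly spaced units from an ordered list after a random start."""
    if not 0 < sample_size <= len(units):
        raise ValueError("sample_size must be between 1 and len(units)")
    step = len(units) / sample_size           # fractional step keeps the sample size exact
    start = random.Random(seed).uniform(0, step)
    last = len(units) - 1
    return [units[min(int(start + i * step), last)] for i in range(sample_size)]

# Hypothetical roster of 350 LEAs, assumed to be ordered by enrollment so the
# audit sample spreads across small and large districts.
roster = [f"LEA-{n:03d}" for n in range(1, 351)]
audit = systematic_sample(roster, sample_size=25)
print(audit[:5])
```

Ordering the roster by a characteristic such as enrollment before sampling spreads the audit across district sizes without requiring a separate stratification step.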



HOW EVALUATIONS ARE CONDUCTED AT THE LOCAL LEVEL 

Despite the specificity of regulations, there exist LEAs that perform
remarkable evaluations. Unfortunately, when we view these in the context
of the 16,000 LEAs that participate in federal education programs, these
exemplars seem to be the exception rather than the rule. Even in the
large districts that were site-visited, mixed levels of practice were
observed: some merely fulfilled evaluation requirements and others went
substantially beyond.



Existing Evidence on How Evaluations are Conducted Within LEAs 

Catherine Lyon and others, at UCLA's Center for the Study of
Evaluation (CSE), identified LEAs with enrollments in excess of 10,000 students
and an evaluation/research unit. Among other issues, Lyon and others were
interested in characterizing types of evaluation activities carried out
within these units. Directors of the research units were asked to complete
a questionnaire pertaining, in part, to the activities and relative amount
of time the unit devoted to these activities. Roughly 230 Directors
responded to the questionnaire.

The CSE data are interesting in that they provide a rough indication of
the extent to which local districts with evaluation units perform certain
types of activities. Further, since each activity was ranked according to
the relative amount of time it consumes, a crude characterization of the
methodological emphasis within these units can be devised. Table 2 reports
pertinent data from the CSE study.

Examination of Table 2 reveals that all districts collect information
on student achievement and 95% of the Directors listed this category as one
of their three most time consuming efforts. Given the emphasis on
achievement testing in the regulations, this is not too surprising. Information
on the relationship between student and/or classroom characteristics and
achievement is less likely to be collected, nor is it ranked as being
time consuming; nearly 60% reported that they did not engage in collecting
this information. Information on the relationship between socio-economic
status and achievement is collected in fewer than half the districts. With
respect to time allocations, these activities were ranked as being one of the
three most time consuming activities by at most 26% of the Directors.

Table 2 also shows that data collection by means of testing is almost
universal (98.8%), followed by survey questionnaires (93.8%). The prevalence
of interviews and classroom observations is considerably lower (65.2% and
60.4%, respectively). Turning to the evaluation activities performed by
evaluation/research units, it can be seen that there is high variability among
districts as to the activities performed and the relative amount of time
devoted to each activity. Assessing the results or worth of a program and
examining the achievement of objectives are carried out by at least
88% of the districts, each being ranked as one of the three most time
consuming activities by about two-thirds of the Directors. Comparing
district results with other districts, assessing satisfaction with the
school or program, approving evaluation plans, and modifying programs
using evaluation results are relatively frequent activities in that
roughly two-thirds or more of the Directors claim to perform these efforts, yet
they are not often ranked within the three most time consuming activities.
Further, even though checking to ensure that the program is implemented
as planned is a crucial phase of the evaluation process, 37% of the
Directors report not engaging in this activity. For those who do, it is
ranked as one of the three most time consuming activities by 21% of the
Directors. Analysis of alternative instructional programs is conducted
within 23% of the districts and is ranked as a time consuming activity
by less than 2% of the Directors.

Table 2

Activities, Functions and Methods Employed by
Evaluation/Research Units at LEAs: CSE Results

                                                   Percent of    Ranked
Evaluation Activities                              Districts     1, 2 or 3

Collection of information: (12 original items)
 1. Student achievement                              100%          94.7
 2. Relationship between school/classroom            41.5          25.6
    characteristics and achievement
 3. Relationship between student achievement         49.3          17.6
    and socioeconomic status
 4. Relationship between students' race/ethnic       40.1          16.3
    background and achievement progress

Methods of Data Collection: (5 original items)
 1. Testing                                          98.8          94.2
 2. Survey questionnaires                            93.8          89.0
 3. Interviews                                       65.2          43.2
 4. Classroom observation                            60.4          37.9

Evaluation activities: (11 original items)
 1. Assess the results or worth of instructional     88.1          69.6
    programs
 2. Assess student achievement of objectives         91.1          66
 3. Identify/appraise educational goals or           77.1          34.4
    objectives
 4. Compare the districts' achievement test          81.0          27.7
    scores with scores outside the district
 5. Determine pupil and/or public satisfaction       65.1          22.4
    with school or programs
 6. Check that implementation conforms to            63            20.7
    program specification
 7. Modify programs using evaluation results         63.8          14.5
 8. Approve evaluation sections of program           71.8          11.9
    proposals
 9. Assist in the selection of instructional         45.9           7.1
    programs
10. Compare costs/benefits of alternate              22.4           1.3
    instructional programs

Webster and Stufflebeam examined how evaluations are carried out
within evaluation units in urban school districts by obtaining estimates
of budget expenditures for a variety of evaluation activities. The
Webster-Stufflebeam analysis is based on thirty-five urban school districts
categorized according to size of the evaluation, research and testing
budget: one million dollars or more (large), $300,000 to $999,999 (medium),
and $50,000 to $299,999 (small). The percentage of budget expenditures
for each of their classes of activities appears in Table 3. In addition
to the average percentage within each district size category and the
weighted average across district size, Table 3 presents the range of values
composing each average. The ranges are interesting in that the presence of
a 0 as a lower value indicates that at least one district expends none of
its budget on that activity.
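
The weighted averages in Table 3 can be checked with a short calculation.
The sketch below assumes that each size category is weighted by its number
of districts (9, 13, and 13); the source does not state the weighting, but
this assumption reproduces the reported figures for the rows tried here.
The function and variable names are illustrative only.

```python
# District counts per size category, as given in the Table 3 header.
DISTRICTS = {"large": 9, "medium": 13, "small": 13}


def weighted_average(category_means, counts=DISTRICTS):
    """Average the category means, weighting each by its number of districts.

    The weighting scheme is an assumption; the report does not describe it.
    """
    total = sum(counts.values())
    return sum(category_means[size] * counts[size] for size in counts) / total


# Rows as reported in Table 3 (percent of budget).
product_evaluation = {"large": 19.0, "medium": 19.6, "small": 17.9}
testing = {"large": 11.8, "medium": 23.4, "small": 19.0}
process_evaluation = {"large": 14.0, "medium": 10.9, "small": 5.5}

for name, row in [
    ("product evaluation", product_evaluation),
    ("testing", testing),
    ("process evaluation", process_evaluation),
]:
    print(f"{name}: {weighted_average(row):.1f}%")
```

Under this assumption the three rows come out at 18.8%, 18.8%, and 9.7%,
which agrees with the weighted averages printed in the table.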

Roughly thirty to forty percent of the budget is expended on testing
and assessing whether the program is successful in meeting its objectives,
or is more successful relative to an alternative method of instruction.
Providing evidence regarding implementation of the program consumes an
additional ten percent of the budget, on average. The same is true for
data processing. Further, providing assistance to other district personnel
(research consultant), proposal development, providing ad hoc information
and instrument development, combined, constitute roughly 20% of the budget.
Needs assessment and diagnosis of constraints upon meeting needs (context
evaluation) and evidence regarding availability/use of resources (input
evaluation) represent less than 7% of the budget.

Another important consideration in describing activities in LEAs is
the degree of diversity among LEAs. For example, testing, in the aggregate,
represents less than 20% of the budget expenditures. Examining the ranges,
we find as little as two percent and as much as 70% expended on testing.
Similarly, product evaluation (success, impact of the program) commands as
little as 5% to as much as 50% of the budget. Of particular interest are
the 0% values. There are some districts (within each category) that do not
allocate any money for input evaluation; the same is true for the research
consultant and proposal development categories. There are some medium and
small districts that do not allocate much (if any) of their budget to
assessing the implementation of the program.
Table 3

Percentage of Budget Expenditures for Evaluation Activities:
Webster and Stufflebeam (1978)

                                               District Size
                                  Large        Medium       Small        Weighted
Evaluation Activities             (9)          (13)         (13)         Average

 1. Product Evaluation            19.0%        19.6%        17.9%        18.8%
                                  (6-50)       (10-40)      (5-50)       (5-50)
 2. Testing                       11.8         23.4         19.0         18.8
                                  (2-22)       (4-70)       (5-45)       (2-70)
 3. Data Processing               [illegible]  [illegible]  [illegible]  [illegible]
                                  (5-25)       (3-16)       (5-15)       (5-25)
 4. Process Evaluation            14.0         10.9         5.5          9.7
 5. Management of evaluation      5.8          8            10.7         8.4
    related resources             (3-10)       (4-19)       (5-18)       (3-18)
 6. Ad hoc information            3.8          4.4          10.8         6.6
                                  (0-15)       (2-10)       (3-20)       (0-20)
 7. Context Evaluation            [illegible]  [illegible]  [illegible]  [illegible]
                                  (2-18)       (2-15)       (0-13)       (0-18)
 8. Planning services             4.6          4.4          4.9          4.6
                                  (0-15)       (2-10)       (1-11)       (0-15)
 9. Research Consultant           3.6          4.7          4.3          4.3
                                  (0-12)       (0-10)       (0-8)        (0-12)
10. Proposal Development          3.9          4.3          4.0          4.1
                                  (0-11)       (0-10)       (0-10)       (0-11)
11. Other research                6.9          2.4          3.8          4.1
12. Instrument development        3.7          4.1          4.0          4.0
    and validation                (1-12)       (2-15)       (0-10)       (0-15)
13. Input Evaluation              3.1          .2           .6           1.1
                                  (0-6)        (0-3)        (0-5)        (0-6)

Note: The number of districts in each size category and the range of
values composing each average appear in parentheses.

Definitions of evaluation activities:

 1. Product Evaluation (Assessment of the relative success of the
    program in meeting its objectives relative to an alternative method
    of instruction, cost/benefits).

 2. Testing (Operation of the system to ensure quality instrumentation
    and reporting).

 3. Data Processing (Actual operation of basic information system).

 4. Process Evaluation (Providing information on factors affecting
    implementation and for aiding the interpretation of program
    evaluation data).

 5. Management of evaluation related resources.

 6. Ad hoc information (Provision of requested information on an ad hoc
    basis).

 7. Context Evaluation (Assessment of needs, description of outcomes,
    actual and desired, diagnosis of problems that prevent needs from
    being met).

 8. Planning services (Technical assistance in planning and managing
    projects or programs).

 9. Research Consultant (Assistance in design and analysis of projects
    conducted by other district personnel).

10. Proposal Development (Development of proposals or the evaluation
    sections of proposals for outside funding).

11. Other research (Basic and applied).

12. Instrument development and validation.

13. Input Evaluation (Providing information on resource availability and
    utilization for accomplishing goals).

The CSE study and the Webster and Stufflebeam study, while focused
on different aspects of how evaluations are conducted in large local
districts, reveal a common theme: the primary mode of evaluation at the
local level is directed at assessing whether a program has met its
objectives, and a considerable amount of time and money is devoted to
testing as a means of describing project success. Fewer resources are
devoted to process assessment, though it receives more financial support
than needs assessment type activities and assessment of the adequacy of
resources for meeting goals/objectives. Further, there is substantial
variability in the way districts allocate resources and the amount of time
expended on various types of evaluation activities.

Both of these studies approach the question of how evaluations are
carried out from a research unit perspective. Little is known about the
distribution of activities for specific programs. Recalling the diversity
of evaluation activities specified in the legislation and regulation across
programs, it would be expected that not all of the evaluation activities
would be equally relevant. Based on the data presented above, it would be
inaccurate to characterize any particular evaluation as being composed of
10% impact assessment, 20% testing, 10% monitoring and so on. To obtain
a better understanding of how particular evaluations are conducted, it is
necessary to examine what procedures are employed for specific programs.
Unfortunately, few data bearing on this issue are available. Some survey
data are available, however.



The National Center for Education Statistics recently examined the
methods employed by LEAs for the evaluation of Title I. In particular,
they focused on how frequently each of the three OE Title I models was
used. Using a national probability sample, it was estimated that 87 percent
of all districts had Title I programs in 1978-1979. Of those districts
with Title I programs, 63 percent used an evaluation model. Of those
using an evaluation model, 90% voluntarily followed one of the three OE
evaluation models. The remaining 10% used a locally developed model. Each
OE model entails before-after testing of students; they differ as to what
form of comparison is employed. The simplest OE model entails using the
publisher's norms as the basis for comparison. This model was used by
96% of those districts who used any model. The remaining 4% was split
between the control group model (Model B) and the special regression
model (Model C) developed by OE; both of these require testing students
who did not participate in Title I. Technically, these latter models
are considered to be methodologically more sophisticated, providing less
equivocal estimates of gain that can be attributed to the program. The
NCES survey shows that models employing comparison groups are infrequently
used. The predominant mode of achievement assessment is through comparison
with norms, that is, what a sample of non-Title I students would have
achieved given their initial level of performance.
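
The NCES percentages are nested, so it can help to restate them as shares
of all districts. The sketch below is only illustrative. It assumes each
percentage applies to the subgroup named just before it, and that the 4%
split between Models B and C is a share of the districts using an OE
model; the text is ambiguous on that last point. All variable names are
hypothetical.

```python
# Shares reported from the NCES survey, treated as nested conditional
# proportions (an assumption about how the percentages relate).
HAS_TITLE_I = 0.87       # of all districts
USES_ANY_MODEL = 0.63    # of districts with Title I programs
USES_OE_MODEL = 0.90     # of districts using any evaluation model
USES_MODEL_B_OR_C = 0.04 # assumed here to be a share of OE-model users


def chain(*shares):
    """Multiply nested conditional shares into a share of the whole."""
    product = 1.0
    for share in shares:
        product *= share
    return product


share_any_model = chain(HAS_TITLE_I, USES_ANY_MODEL)
share_oe_model = chain(HAS_TITLE_I, USES_ANY_MODEL, USES_OE_MODEL)
share_comparison = chain(
    HAS_TITLE_I, USES_ANY_MODEL, USES_OE_MODEL, USES_MODEL_B_OR_C
)

print(f"any evaluation model:  {share_any_model:.1%} of all districts")
print(f"an OE model:           {share_oe_model:.1%} of all districts")
print(f"Model B or C:          {share_comparison:.1%} of all districts")
```

Read this way, roughly half of all districts follow an OE model, while
comparison-group models reach only about two percent of all districts.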

Across the three studies, there appears to be a substantial emphasis
on testing as a means of evaluating programs. The NCES survey shows
extensive use of the before-after norm-based assessment. The use of other
methods is infrequent, and NCES estimated that districts would require
additional assistance in developing more complete evaluation practices
(e.g., continuous program improvement, selection of non-test based
evaluation devices).
Reporting requirements for many programs emphasize achievement
testing. As a consequence, it seems reasonable to expect that fulfilling
minimum requirements, for example, before and after testing of only those
students who receive Title I services, would be the most frequently observed
practice. However, while alternate methods are used infrequently, the fact
that 400 districts were estimated to be using some form of comparative
assessment suggests that they are feasible and their use should be promoted.

Another glimpse at how evaluations are conducted can be obtained by
examining the procedures used by projects that have been approved by the
Joint Dissemination and Review Panel (JDRP). Each year a brief summary of
the approved projects and the evidence supporting their effectiveness is
published in a volume entitled Educational Programs That Work. The most
recent edition contains 34 newly approved programs; the methods employed
for these most recent programs were examined. Specifically, of the 34
programs listed, 62% employed comparison group designs, 56% employed
standardized tests or a combination of standardized and locally developed
tests, and roughly 30% involved some form of replication where the program
was evaluated more than once, revealing consistently positive results. For
approved exemplary projects, the most common design entailed comparison of
project performance where one group did not participate in the project or
program. It is also of interest to note that 38% of the evaluations did not
employ comparison group designs, the most typical being a before-after
assessment of the project participants only. In many cases the evidence
supporting the achievement gain was supplemented with additional evidence
that the program was sufficiently implemented; that the gains could be
replicated from year to year; and that the use of multiple testing devices
revealed consistent evidence.



Illustrations of Evaluation Practices at LEAs

Up to this point, the discussion has attempted to provide statistical
characterizations of how evaluations are conducted in LEAs. We have made
reference to the fact that multiple procedures are employed in an individual
evaluation effort. To illustrate the process of conducting evaluations at
the local level, case studies of five evaluations encountered during the
site visits are provided. Material for the last case study appeared in a
recent publication.

Case I. In site D, the evaluation unit and the instructional programs
are administratively and fiscally independent. To avoid miscommunication
during the course of an evaluation, a team composed of a member of the
evaluation unit and a program specialist from the district's Instructional
Programs Division is assigned to each school. The team assists the
school personnel in drafting and measuring their program objectives,
which ultimately become a component of the application for funding. This
approach was devised to ensure that advice regarding curriculum and issues
pertaining to evaluation of the program were resolved by appropriate
experts.

For program development, the team negotiates with the school personnel
in forming appropriate objectives; a second team reviews the plan as a
form of external validation. In the planning process, previous evaluative
evidence is considered. This evidence is provided by the evaluation unit
and entails an individualized report which summarizes the extent to which
previously stated objectives have been attained. Test results are summarized
in graphic form so as to facilitate communication. Through periodic
monitoring of the classroom activities, process information is collected and
systematically reported to the school personnel.

Case II. The evaluation of Title I in site C is best described as
compliance-oriented. The evaluation unit produces a report that is in
accordance with SEA reporting requirements.

To augment the minimal testing information provided by the evaluation
unit, the Title I program staff engage in a variety of evaluative activities
that they initiate and conduct independent of the evaluation unit. For
example, one component of the program, a kindergarten language development
program, was assessed in terms of language skill achievement scores. The
testing and data analysis were done by program speech therapists and the
program coordinators. Technically, this report was not very sophisticated;
inferences as to the effects of the program were overstated. Other examples
of locally initiated evaluations conducted by program staff include an
assessment of parent and project staff needs and an evaluation of one of the
inservice workshops. Both of these assessments were carried out through the
use of locally developed questionnaires or modifications of existing
instruments. In the absence of cooperation from the evaluation unit, the
program staff in this site resorted to conducting their own evaluations.

Case III. The evaluation of Title I in site G is similar to that observed
in Case I, above. The major exception is that, while the evaluation unit
is administratively independent of the program, it is not fiscally
independent of the program. The Director of Federal Programs is responsible
for the operation of Title I in this district; evaluation money is also
distributed by this individual.

Within the district evaluation office, a full-time evaluation specialist
is assigned to the Title I evaluation. Close association with the project
staff results in a cooperative working relationship. The evaluation specialist
monitors the progress of the program at each site through formalized classroom
observations, test results, and monitoring of the completion of specified
instructional objectives.

To ensure that the evaluation remains objective, the State Department
of Education "strongly suggested" that an outside contractor be hired to
review the evaluation documents produced by the district's unit.

Case IV. In site [illegible], the evaluation of the Title VII Bilingual
Education program was conducted by an outside contractor. The evaluation was
monitored by the research/evaluation unit and the program staff. Prior to
acceptance of the final report, the program staff and the evaluation
personnel requested that additional analyses be performed. In this way,
through technical assistance from the evaluation unit, the program staff
were able to make certain their informational needs were satisfied and
that the evaluation information was valid.

The working relationship between the evaluation unit and the program
staff is also depicted by their collaboration in the development of entry
and exit criteria for participants involved in the Bilingual program.
The evaluation process in this district regularly involves a combined
effort of outside contractors and evaluation unit staff. A similar
arrangement was observed in sites C and B.

Case V. In site E, the evaluation unit is responsible for assessing
compliance with the regulations for P.L. 94-142 (Education for the
Handicapped). Information is gathered through parent, citizen, and
community agency surveys, analysis of a random sample of Individual
Education Plans (IEPs), and teacher/administrator surveys.

At the request of the program director and through additional funding
(about $5000), an examination of the operation and cost effectiveness of
service delivery was undertaken. This evaluation entailed an analysis of
the amount of time devoted to program tasks, the number of students served
across numerous schools, and ratings of the quality of the services
provided. Document analysis and survey procedures were employed for the
quality of service assessment. An alternative placement procedure was also
[...]
Case VI. Paul Rost (1980), from the Albuquerque Public Schools, reported
the use of a four-component planning, evaluation and [illegible]
that attempts to facilitate the use of evaluative information in the
planning of Title I. A manual, referred to as Item Documentation, was
developed to provide teachers with an indication of what each test item
was designed to measure. By reporting test results on [illegible]
basis, the teacher, through the use of the Item Documentation, can readily
determine which skills should be focused on for each child.

A School Report synthesizes evaluation data in a manner that makes
it conducive to planning. The Resource Allocation Plan, based on Title I
needs assessment data, includes a summary of the school's relative need
for supplemental assistance. Finally, [illegible] is conducted. Here the
instructional team reviews the evaluation data with a member of the
evaluation staff. The instructional team [illegible] interpretation of
the evaluation, giving reasons for specific outcomes based on events
occurring during the school year.



Evaluation of Title I in Small and Large LEAs

With the exception of the NCES survey of school districts' use of
the USOE Title I models, most of the available evidence on evaluation
within LEAs focuses on districts with large enrollments or those with
evaluation/research units. Nationally, there are nearly 16,000 operating
districts, less than five percent with enrollments in excess of 10,000
pupils. Since our site visits included large LEAs, a broader basis for
judging how evaluations are conducted was obtained through a telephone
survey directed at LEAs in four categories: enrollments of 2,500-5,000
pupils, 5,000-10,000, 10,000-25,000, and those with 25,000 or more pupils.
The personnel responsible for the administration of Title I were
interviewed, and follow-up discussions with specific evaluation personnel
were undertaken whenever possible and/or warranted. The interview questions
focused on how evaluations were conducted, who is responsible for their
completion, and how the results are used.

In describing how evaluations are conducted, it is useful to establish
how much of the Title I award is devoted to evaluation and the
administrative placement of evaluation within the LEA. Information pertinent
to these issues is summarized in Table 4. To assay the amount of federal
money used in the evaluation of Title I, LEA personnel provided information
on the amount of their most recent Title I award and the amount devoted
to the evaluation effort. The estimates derived from these responses
show that, across LEAs, about 1.5% of the Title I budget is devoted to
evaluation. The smallest percentage for evaluation, 0.8%, was reported
by districts with 2500-5000 pupils. In terms of actual dollar values,
this is a trivial expenditure for evaluation. The most frequent comments
offered by officials in these LEAs were that (1) costs incurred for
evaluation were assumed by the district, (2) the evaluation was based on
a pre-existing testing program which added no extra cost, or (3) only
testing materials were purchased out of Title I funds. Of the remaining
LEA clusters, the average set-aside ranged between 1.6% and 1.8%. For
large grants, these set-asides are sufficient to hire multiple full-time
personnel. Using the percentage of the grant award is a convenient way to
summarize the extent to which federal money is allocated to evaluation,
but it does not take into consideration the size of the grant or the
diversity across LEAs. Figure 1 provides a graphic display of the
diversity among LEAs with respect to the actual amount of money devoted to
evaluation across Title I awards. The most striking aspect of Figure 1
is the diversity of money allocated within each award category. Second,
if we consider the actual dollar figure allocated across districts, only
about 38% of the districts reported allocating $10,000 or more of Title I
funds for evaluation. Roughly 51% of the districts report spending in
excess of $5000 of Title I funds for evaluation; 16% reported spending no
federal money for evaluation.

In many cases, small LEAs reported that "filling out the forms only 
took a few days" so they found little reason to allocate any funds to 
the "evaluation". We found little consensus among LEAs as to how much 
of the budget could or should be devoted to evaluation. 

Table 4 also summarizes information on the administrative placement
of evaluation within LEAs. Consistent with our experience from the site
visits, in the majority (69.4%) of large districts, the evaluation
personnel do not report to the program officials directly; that is, they
are administratively independent. As the size of the district decreases,
the extent to which evaluation is independent of the program drops to
roughly 50%; in the smallest districts in our sample, less than one in
four are administratively independent. If we consider independence in
terms of who has control over evaluation expenditures, the degree of
independence accorded the evaluation is further reduced across all
categories of LEAs: 41.7% of the evaluations in large LEAs are fiscally
independent of the program; only 14.8% are fiscally independent in the
smallest LEAs in our sample. Considering administrative and fiscal
independence together, the values diminish even further. If we judge the
import
Table 4

LEA Title I Evaluation Allocations and Organizational
Arrangements: Telephone Survey Responses

LEA Category     Number     Response  Percent for  Administratively  Fiscally              Beyond
(Enrollment)     Contacted  Rate      Evaluation   Independent       Independent  Both     Regulations

25,000 and above 40         100.0%    1.6%         69.4%             41.7%        36.1%    86.4%
                            (40)      (38)         (36)              (36)         (36)     (37)

10,000 to 25,000 38         94.7%     1.7%         45.5%             21.2%        21.2%    80.0%
                            (36)      (32)         (33)              (33)         (33)     (35)

5,000 to 10,000  38         86.8%     1.8%         50.0%             21.4%        17.9%    75.8%
                            (33)      (33)         (28)              (28)         (28)     (33)

2,500 to 5,000   34         82.4%     .8%          22.2%             14.8%        11.1%    39.3%
                            (28)      (27)         (27)              (27)         (27)     (28)

Total            150        91.3%     1.5%         48.4%             25.8%        22.6%    72.2%
                            (137)     (129)        (124)             (124)        (124)    (133)

Note: The number of usable responses for LEAs in each cluster appears
in parentheses. "Both" refers to evaluations that are both
administratively and fiscally independent of the program.

[Figure 1: Dollar Allocation for Evaluation from the Title I ...]

of these figures from the context of our discussions with LEA personnel,
the lack of fiscal and administrative independence does not usually
result in constraints on how and how well evaluations are conducted;
there are noteworthy exceptions to this generalization, however. These
constraints are addressed in Chapter 5.
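
The pooled row of Table 4 can be cross-checked against the cluster rows.
The sketch below assumes the pooled percentage is the average of the
cluster percentages weighted by the number of usable responses; the report
does not say how the pooled row was computed, and the names used are
illustrative.

```python
# (percent administratively independent, usable responses) per enrollment
# cluster, as reported in Table 4.
ADMIN_INDEPENDENT = [
    (69.4, 36),  # 25,000 and above
    (45.5, 33),  # 10,000 to 25,000
    (50.0, 28),  # 5,000 to 10,000
    (22.2, 27),  # 2,500 to 5,000
]

# Same layout for fiscal independence.
FISCALLY_INDEPENDENT = [(41.7, 36), (21.2, 33), (21.4, 28), (14.8, 27)]


def pooled_percent(rows):
    """Weight each cluster percentage by its number of usable responses."""
    total_n = sum(n for _, n in rows)
    return sum(pct * n for pct, n in rows) / total_n


def response_rate(responded, contacted):
    """Overall response rate as a percentage."""
    return 100.0 * responded / contacted


print(sum(n for _, n in ADMIN_INDEPENDENT))            # usable responses
print(round(pooled_percent(ADMIN_INDEPENDENT), 1))     # pooled admin. figure
print(round(pooled_percent(FISCALLY_INDEPENDENT), 1))  # pooled fiscal figure
print(round(response_rate(137, 150), 1))               # overall response rate
```

Under this weighting the cluster rows reproduce the pooled values of 48.4%
and 25.8% on 124 usable responses, and 137 of 150 contacts gives the
reported 91.3% response rate.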

With respect to how Title I evaluations are conducted, two major
issues are pertinent: the extent to which consistent procedures are
employed across LEAs and the extent to which LEAs supplement the
required evaluations in order to satisfy their own needs. With regard
to the first question, 85.3% of the LEAs reported using one of the three
USOE Title I models, the most predominant (80.1%) being the
norm-referenced Model A. Consistent with the NCES survey, only 5 percent of
the LEAs used Model B, Model C, or an approved alternative design; these
were [illegible] within large school districts.

The second issue relevant to how evaluations are done is whether
and in what ways LEAs supplement the required Title I evaluation. The
last column of Table 4 shows that, with the exception of small districts,
at least three-quarters of LEAs go beyond federal reporting requirements
to some degree; less than 40% of the small districts (2500-5000 pupils)
report supplementing beyond federal requirements. A follow-up question
on the types of supplemental evaluation activities that are undertaken
revealed that a variety of strategies are employed. The more
sophisticated districts perform side-studies to assess the validity of
achievement data, to assess implementation of the program, and to determine
the extent to which the program environment is different from the comparison
group's educational environment. Also, elaborate feedback mechanisms have
been developed to provide evaluative information to teachers, parents and
other local officials. More typical supplemental activities that were
reported include: one-time surveys of teacher attitude, assessment of
changes in the affective domain produced by exposure to inservice workshops
or the Title I program, classroom observations, and assessment of locally
developed material. Judging from material received through site visits and
telephone solicitations, the quality of these additional activities ranges
from poor to exceptional. However, for a substantial number of districts
with enrollments of 5000 or more, there appears to be at least some effort
devoted to obtaining information that is suited to the needs of the
district personnel.

3.5 HOW EVALUATIONS ARE CONDUCTED AT THE FEDERAL LEVEL

The preceding section addressed the evaluation of federal programs 
at the state and local levels of government. These evaluations represent 
one aspect of how federal programs are evaluated. The second source of 
evaluative data is derived from national level studies initiated by the 
federal agency or explicitly mandated by the Congress. This section 
focuses on the procedures associated with the latter type of evaluative 
studies. 
Evaluation within OED

The Office of Evaluation and Dissemination (OED) at USOE engages in
numerous evaluation activities that can be classified [illegible]
process studies to provide [illegible] operation of programs administered
by USOE; (3) technical [illegible].

After the decision to evaluate [illegible] evaluations are conducted, a
detailed description of [illegible] which programs will be [illegible]
execution of these studies.

The essential phases of the OED procedures are as follows: 

Phase 1.  An OED staff member is assigned to the study shortly
          after it is established as a high priority and usually
          remains the project monitor throughout the duration
          of the study.

Phase 2.  The monitor and program staff and additional evaluation
          experts convene a series of meetings to discuss the
          nature and scope of the issues to be addressed by the
          study. The [illegible] plan is negotiated at this phase,
          [illegible] of the questions to be addressed,
          [illegible] of the study, and methods to be
          employed. The feasibility of the evaluation plan is
          considered at this phase through consultation with
          experts in the field and evaluation personnel within
          [...]

Phase 3.  Having specified the scope of the study, [illegible]
          a Request for Proposal (RFP) is written.
          These RFPs are explicit as to procedures that are to
          be used in conducting the study.

Phase 4.  A contractor is selected through a competitive bid
          process.

Phase 5.  Once the contractor is selected and the evaluation is
          initiated, the OED project monitor maintains close
          contact with the contractor throughout each aspect
          [illegible]. The purpose of this monitoring
          function is (1) to make sure that the study is conducted
          in accordance with the original plan, (2) to advise the
          contractor when problems arise which may jeopardize
          the validity of the study, and (3) to ensure that the
          study, as executed, answers the intended questions.

Phase 6.  At the completion of the contract, the monitor takes
          responsibility for the production of a project summary
          and ensures that the report is reviewed and approved
          by the Department and ultimately transmitted to the
          appropriate audiences.



Types of Questions Addressed and Characteristics of OED Studies

In discussing why evaluations are conducted, the type of evaluation
questions that are addressed served as a functional classification scheme.
In Chapter 2, four categories of questions were offered, entailing five
specific questions. These questions include:

1. Who is served? 

2. How are services provided? 

3. What are the costs of these services? 

4. What are the effects of these services? 

5. What are the costs or benefits of alternative forms
of service?

As reported earlier, the nature and cost of service are the most
frequent questions addressed in federal level evaluations. It was also
noted that multiple questions are addressed in many instances. Since
the nature of the evaluation effort is dependent upon the type(s) of
questions that are posed, it is useful to look at the complexity of these
studies with respect to the number of questions that have been addressed.

Applying the classification scheme devised in Chapter 2 to the
sixty-four (64) studies reported by OED during 1977-1979, we find that a
single question is addressed by 45% of the studies; two questions are
addressed in 37.5% of the cases; and three or four questions are addressed
in 14% and 3% of the studies, respectively. The major inference that can be
drawn from these figures is that, for the most part, OED evaluations are
focused on rather specific evaluative questions.

It was also indicated earlier that, contrary to what had previously
been claimed, only a minority of the studies undertaken by OED were focused
on obtaining answers to questions pertaining to the impact of educational
programs. Specifically, over the past three years, 25 of the 64 completed
studies were focused on impact questions, that is, answering questions
regarding the effects of services on program participants. Of the
twenty-five impact studies, only 16% (or 4 of 25) were directed at simply
looking at the impact of service. Instead, for 84% of the studies, multiple
questions were considered. Specifically, 36% answered the impact question
and one additional question; 36% addressed impact and two other questions;
and four of the five questions were addressed in 12% of these impact
studies. None of the OED studies attempted to gather evidence pertinent to
all five questions.
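
The percentages quoted for the OED studies can be turned back into study
counts as a consistency check. The sketch below assumes each percentage is
a rounded share of the stated total (64 studies overall, 25 impact
studies); the resulting counts are inferred here and, apart from the
"4 of 25" figure, are not stated in the report.

```python
def implied_count(percent, total):
    """Nearest whole number of studies implied by a rounded percentage."""
    return round(percent / 100 * total)


# Percent of all 64 OED studies addressing one, two, three, or four questions.
ALL_STUDIES = {1: 45, 2: 37.5, 3: 14, 4: 3}

# Percent of the 25 impact studies, by total number of questions addressed
# (impact alone, impact plus one, plus two, plus three).
IMPACT_STUDIES = {1: 16, 2: 36, 3: 36, 4: 12}

all_counts = {k: implied_count(p, 64) for k, p in ALL_STUDIES.items()}
impact_counts = {k: implied_count(p, 25) for k, p in IMPACT_STUDIES.items()}

print(all_counts, sum(all_counts.values()))
print(impact_counts, sum(impact_counts.values()))
```

The inferred counts sum to 64 and 25 respectively, so the reported
percentages are internally consistent with the stated totals.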

So, while the majority of OED-sponsored studies are focused on
answering one or two specific questions, when we focus attention on
just those studies that are aimed at understanding the impact of programs
on participants, it is apparent that these studies are rarely focused
on answering only the impact question. Incorporating diversity as to
what is examined provides information which can serve the needs of
multiple constituencies, not only those interested in the impact.
Further, coupling questions regarding impact with assessments of the
nature and/or cost of service provides a more complete understanding
of the reasons for the observed level of impact.



Characteristics of OED Studies

Who conducted these studies? A closer inspection of the 64 evaluations
reported in the OED Annual Reports reveals that 60 specific studies were
conducted. Numerous groups were involved in the execution of these studies.
Of the 60 studies, 5 were conducted by OE staff, 1 by NIE, and the source
of 2 studies could not be readily identified. The remaining 52 studies
were conducted by individuals outside OED. Specifically, 30 different
groups participated in the conduct of these studies: 22 research firms
were used, 2 universities, 2 federal agencies other than OE, and 4
organizations classified as "other."



How long do these studies take to complete? The duration of each
contract can serve as a rough approximation as to the length of time that
is required to complete an evaluation. The range of the time devoted across
the 52 OED studies conducted outside the agency is presented below.



Distribution of Contract Duration for OED Evaluations

Months            Percent of     Cumulative
                  Contracts      Percentage

73 or more           1.9%          100.0%
61-72                3.8            98.1
49-60                1.9            94.2
37-48                7.7            92.3
25-36               25.0            84.6
13-24               36.5            59.6
6-12                11.5            23.1
less than 6         11.5            11.5



From the distribution it is apparent that there is considerable diversity
in the duration of contracts: the shortest is only one month and the
longest exceeds 6 years. More typical durations, however, are in the
range of one to three years. Nearly 85% of the contracts are completed
within three years, with an average duration of 24.5 months and a median
duration of 22 months.
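
The cumulative column of the duration table can be reproduced from the
percent-of-contracts column. The following is a minimal sketch, assuming
that each percentage corresponds to a whole number of the 52 contracts;
those inferred counts are an assumption of the sketch and do not appear in
the report.

```python
TOTAL_CONTRACTS = 52

# (duration bin in months, percent of contracts) as reported in the table,
# listed from the shortest bin to the longest.
reported = [
    ("less than 6", 11.5),
    ("6-12", 11.5),
    ("13-24", 36.5),
    ("25-36", 25.0),
    ("37-48", 7.7),
    ("49-60", 1.9),
    ("61-72", 3.8),
    ("73 or more", 1.9),
]

def cumulative_percentages(rows, total):
    """Infer whole-contract counts per bin, then accumulate them."""
    running = 0
    result = []
    for label, percent in rows:
        running += round(percent * total / 100)  # inferred count (assumption)
        result.append((label, round(100 * running / total, 1)))
    return result

cumulative = cumulative_percentages(reported, TOTAL_CONTRACTS)
for label, cum in cumulative:
    print(f"{label:>12}  {cum:5.1f}")
```

Accumulating whole contracts rather than the rounded percentages avoids
the small drift that comes from adding rounded figures, which is why the
fourth bin comes out at 84.6 rather than 84.5.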
How much do evaluations cost? Because of the confusion over what is
and what isn't considered to be evaluation research, it is exceptionally
difficult to obtain an idea of how much is spent conducting national level
studies. Even though the 52 contracts described here do not represent a
scientifically based sample of evaluations, examination of the amount of
money allocated can be informative as a benchmark as to the rough costs
of conducting national evaluations.

The gross, unadjusted dollar values allocated to each contract are
categorized below, yielding the following distribution.



Dollar allocation           Percent of   Cumulative   Percentage of      Number of
(Total = $52,003,061)       contracts    percentage   total allocation   contracts

$2,000,000 and above           9.6%        100.0%          57.7%             5
$1,000,000 to $2,000,000      11.5          90.4           16.4              6
750,000 to 1,000,000           7.7          78.8            6.9              4
500,000 to 750,000             5.7          71.1            3.4              3
250,000 to 500,000            21.1          65.4            7.9             11
100,000 to 250,000            23.1          44.2            7.0             12
50,000 to 100,000              7.7          21.1             .4              4
10,000 to 50,000               9.6          13.5             .2              5
below 10,000                   3.8           3.8            <.1              2

                                                          100.0%            52



Examination of the distribution reveals at least two major classes of
national studies. The first may be considered as "large scale" studies,
those costing in excess of one million dollars. These are a minority,
accounting for roughly 21% of the contracts. For this class, the maximum
allocation was slightly over $17 million for the 5 year Sustaining
Effects Study conducted by Systems Development Corporation. More typical
for the class of evaluations is an allocation of less than $3 million.
The second class of national studies is reflected in the substantial
concentration of awards in the $100,000 to $500,000 range, accounting for
about 44% of the contracts.



Given that the average duration of a contract is on the order of two
years and most (65.4%) are allocated at less than $500,000, the impression
that all national level studies are large-scale, expensive undertakings
does not appear to be substantiated. On the other hand, if the percentage of
total dollar allocation ascribed to each category of contracts is considered,
the large-scale studies that were conducted account for nearly 75% of the
money allocated for the 52 studies; the Sustaining Effects Study alone
accounts for one-third of the $52 million expenditure. While expensive,
large-scale studies still represent only a fraction of the activity
undertaken by OED.
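
The contrast drawn above between the share of contracts and the share of
dollars can be checked directly against the cost table. The sketch below
is illustrative only; the helper names are hypothetical, and the count of
2 contracts in the lowest category is inferred from the total of 52 rather
than read from the table.

```python
# (lower bound of the dollar category, number of contracts,
#  percent of total dollars allocated), following the cost table.
rows = [
    (2_000_000, 5, 57.7),
    (1_000_000, 6, 16.4),
    (750_000, 4, 6.9),
    (500_000, 3, 3.4),
    (250_000, 11, 7.9),
    (100_000, 12, 7.0),
    (50_000, 4, 0.4),
    (10_000, 5, 0.2),
    (0, 2, 0.0),  # count inferred; dollar share reported only as "<.1"
]

total_contracts = sum(n for _, n, _ in rows)

def share_of_contracts(predicate):
    """Percent of contracts whose category lower bound satisfies predicate."""
    matched = sum(n for low, n, _ in rows if predicate(low))
    return round(100 * matched / total_contracts, 1)

def share_of_dollars(predicate):
    """Percent of total dollars in categories satisfying predicate."""
    return round(sum(d for low, _, d in rows if predicate(low)), 1)

def is_large(low):
    return low >= 1_000_000

def is_under_500k(low):
    return low < 500_000

def is_mid_range(low):
    return 100_000 <= low < 500_000

print(share_of_contracts(is_large), share_of_dollars(is_large))
print(share_of_contracts(is_under_500k), share_of_contracts(is_mid_range))
```

Read this way, the large-scale class is about one contract in five but
roughly three dollars in four, which is the asymmetry the paragraph
describes.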

The studies being considered were initiated in the early to mid 1970's.
The focus of evaluations conducted through OED has changed since that time;
large- and medium-scale efforts are less often initiated than in the past.
Evidence for this change in orientation appears in the form of increased
emphasis on exploratory/evaluability assessments and increased expenditures
for provision of Technical Assistance.



Illustrations of OED Evaluations. Statistical characterization
of evaluations can be informative in that crude parameters can be established
as to how long they take, what questions are addressed, the number of
questions that are addressed, and so on. However, beyond these global
descriptions, there are considerable differences among national studies.
The complexity of the issues addressed and the type of evidence that is
required inevitably require an evaluation design that is tailored to the
specific questions. This tailoring concerns multiple aspects of the
evaluation process. Namely, measures must be selected that are appropriate to
the objectives of the program; when interviews are conducted, target groups
need to be identified and questions developed that are pertinent to each
group. Design considerations concerning the allocation of individuals
to the program conditions must be specified, or the evaluator must identify
comparison groups after the fact, and so on. The procedural
details that ultimately contribute to the quality of the evaluation are
almost impossible to enumerate meaningfully. There are some general themes,
however.

To illustrate the commonality and variety of methods that are likely
to be encountered, brief summaries of salient procedural aspects of two
recent OED sponsored studies are provided as Exhibits A and B. Exhibit A
summarizes the Vocational Education Equity Study conducted by American
Institute for Research. Exhibit B depicts the procedural aspects of the ESAA
Magnet School Program Study carried out by Abt Associates. Both studies
illustrate a number of common procedures employed in national level studies;
their uniqueness is also apparent.

As for the common elements: both studies have multiple objectives.
A sample of schools, individuals, or other units is examined, and information
is obtained from multiple sources; in both cases, information from
previously existing sources and new data collection procedures are employed.

On the other hand, there are notable differences in the technical
aspects of each study. The Vocational Education Equity Study used a
scientifically based probability sample in order to meet the intent of
the legislation. Here, the legislation requested an assessment of sex
inequity in all vocational education programs. The contractor noted
that such a wide-scale assessment would not be feasible given the number
and diversity of program sites. The sample was designed to estimate the
extent of inequity in all programs. Within schools, the individuals who
were interviewed were also drawn according to sampling procedures. These
are expensive procedures.
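
The two-stage design just described (schools first, then individuals
within each school) can be sketched as follows. This is a minimal
illustration, not the contractor's actual procedure: it assumes simple
random sampling with equal probabilities at both stages, the sampling
frame is entirely made up, and the function name is hypothetical. Only
the 100 schools and the per-school quotas of 4 counselors, 8 teachers,
and 35 students come from Exhibit A.

```python
import random

def two_stage_sample(schools, n_schools, quotas, seed=0):
    """Draw schools at random, then respondents within each drawn school.

    `schools` maps a school id to {role: [person ids]}; `quotas` maps a
    role to the number of respondents wanted per school.
    """
    rng = random.Random(seed)
    chosen = rng.sample(sorted(schools), n_schools)
    sample = {}
    for school in chosen:
        roster = schools[school]
        sample[school] = {
            role: rng.sample(roster[role], min(k, len(roster[role])))
            for role, k in quotas.items()
        }
    return sample

# Hypothetical sampling frame: 500 schools with made-up rosters.
frame = {
    f"school-{i:03d}": {
        "counselor": [f"c{i}-{j}" for j in range(6)],
        "teacher": [f"t{i}-{j}" for j in range(40)],
        "student": [f"s{i}-{j}" for j in range(300)],
    }
    for i in range(500)
}

quotas = {"counselor": 4, "teacher": 8, "student": 35}  # per Exhibit A
sample = two_stage_sample(frame, n_schools=100, quotas=quotas)
respondents = sum(
    len(people) for roles in sample.values() for people in roles.values()
)
print(len(sample), respondents)
```

With these quotas the design implies up to 47 interviews per school, or
4,700 across 100 schools, which gives some sense of why such procedures
are expensive.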

For one phase of the study, the identification of exemplary projects,
the sample was not scientifically based. Rather, deliberate, purposive
EXHIBIT A

VOCATIONAL EDUCATION EQUITY STUDY

CONTRACTOR: AMERICAN INSTITUTE OF RESEARCH, Palo Alto, California
AUTHORS: HARRISON, DAHL AND OTHERS
INITIATED BY: CONGRESSIONAL MANDATE

PURPOSE: Section 523(2) of the Education Amendments of 1976 (P.L. 94-482) directs
the Commissioner of Education to conduct a "study of the extent to which sex
discrimination and sex stereotyping exists in all vocational education programs
assisted under the Vocational Education Act of 1963." The mandate also requests
an assessment of the progress that has been made to reduce or eliminate these
practices.

OVERVIEW OF PROCEDURES: Eight objectives for this study were identified by staff
at the U.S. Office of Education. These included: assessing the extent of sex
discrimination and stereotyping in vocational education programs and the progress
which has been made in reducing sex inequities; general progress which has been
made towards reducing sex inequities; identification and analysis of practices
and activities that hinder or facilitate equal opportunity at state, local, and
school levels, as well as external factors; to identify and analyze programs that
are successful at reducing sex bias; and to develop criteria by which federal,
state, and local agencies can measure progress towards equal opportunity.

The study conducted by AIR entailed three primary evaluation components:

. Primary data collection
. Secondary data analysis and literature review
. Case studies of selected programs

SCOPE OF WORK:

. Interviews with State Agency personnel in 49 States and the District of
  Columbia.

. 100 schools offering vocational training were drawn in accordance
  with probability sampling techniques.

. At each school, 4 Counselors, 8 Teachers, and 35 Students were selected
  according to probability sampling techniques. Background characteristics
  of each respondent, and attitudes and perceptions of sex inequity,
  were obtained through structured interview procedures.

. Secondary data analyses were performed by a sub-contractor using data
  from a variety of sources. These included: the Census Bureau, the Office
  of Civil Rights, Bureau of Labor Statistics, and enrollment data obtained
  by the Bureau of Adult Education (BOAE).

. A nationwide search for exemplary and promising projects that attempt to
  reduce sex inequity was conducted by contacting experts in the field, a
  survey of relevant literature, and from nominations obtained through site
  visits. Twelve sites were visited and documented as case studies. A
  review of the available evidence, documenting the projects' effectiveness,
  served as a criterion for their being included as a case study. Twelve
  other sites were described as promising and given a brief writeup in the
  report.

FUNDING LEVEL: $957,9^8

DURATION: 24 months
EXHIBIT B

STUDY OF THE EMERGENCY SCHOOL AID ACT MAGNET SCHOOL PROGRAM

CONTRACTOR: ABT ASSOCIATES, INC., Cambridge, Massachusetts

^1"^' '"^"^'^ ™ CHERYLL SI>.ONS. 

INITIATED BY: USOE

PURPOSE: This study had four major objectives:

mLnL'"'^' ESAA magnet schools, and other comparison 
magnet schools, including characteristics of the 
school districtsi 

a° ri::eg^^«ati:niLiff^""^"'^^ 

programranf ' -^-^ 

(d) To examine the operation of the ESAA Magnet School Program.

OVERVIEW OF PROCEDURES:

The study entailed 3 primary evaluation activities:
. primary data collection



SCOPE OF WORK: 



18 school districts were selected; at each district at least
te^' Visited by two members of 'tS ^ffg.eh 

anf; It on characteristics of the school 

"te visits t" ^^n'"'' « random. 'Ihe 

n threl tes n toS P'^-^dures were pilot tested 

Data coll^ctio; i^riuded'' '"^ "7 magnet schools. 

(1) unstructured open-ended interviews with district administrators,
principals, teachers (in magnet and nonmagnet schools), parents

(2) quantitative data included school enrollment reports, 1970
census for demographic data, desegregation plans for court

Assessment of the magnet schools as a desegregation device required
the construction of two indices, subscription rate and racial
balance. These were derived from enrollment data and projected
enrollment data for minority and majority students. The criterion for
success was established as a subscription rate of 1.0 or more.

Racial balance was determined as the difference between majority
and minority subscription rates. Longitudinal assessment of
district desegregation, in magnet and non-magnet schools, was
undertaken using a common Index of Dissimilarity. Interracial
contact and a ratio of the two Indices interracial 

* Prolect "^^^ """f effectiveness were Isolated. 

. Project selection, criteria for review, and program operation were
also examined.

FUNDING LEVEL: $258,827

DURATION: 16 months
sampling was undertaken to find exemplary practices. This type of
purposive sampling is common practice in national studies that attempt
to identify exemplary programs. As a procedure, it is well suited
given the intent of the study; it is not adequate for estimating
effects, however.

In the ESAA Magnet School Study, districts were categorized
according to general characteristics and then randomly selected from
each category. The major concern here was that the selected districts
were representative of the variety of contexts and conditions in which
Magnet schools operate. Since this study was not designed to estimate the
extent to which all ESAA Magnet Schools impact on desegregation, the
attention to sampling of interviewees at each site was not relevant, as
it was in the Vocational Education Equity Study.
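
The indices named in Exhibit B for the Magnet School Study can be
illustrated with a short sketch. Several points here are assumptions
rather than statements from the report: the subscription rate is taken to
be actual enrollment divided by projected enrollment (the exhibit says
only that it was derived from both), the Index of Dissimilarity is given
in its standard textbook form, and all enrollment figures and function
names are hypothetical.

```python
def subscription_rate(actual, projected):
    """Actual enrollment relative to projected enrollment (assumed form)."""
    return actual / projected

def racial_balance(majority_rate, minority_rate):
    """Difference between majority and minority subscription rates."""
    return majority_rate - minority_rate

def index_of_dissimilarity(minority_counts, majority_counts):
    """Standard Index of Dissimilarity across the schools of one district.

    0 means both groups are spread identically across schools; 1 means
    they attend entirely separate schools.
    """
    total_min = sum(minority_counts)
    total_maj = sum(majority_counts)
    return 0.5 * sum(
        abs(a / total_min - b / total_maj)
        for a, b in zip(minority_counts, majority_counts)
    )

# Hypothetical magnet school: enrollment against projections.
majority = subscription_rate(actual=250, projected=200)
minority = subscription_rate(actual=150, projected=200)
meets_criterion = majority >= 1.0 and minority >= 1.0  # 1.0 or more
balance = racial_balance(majority, minority)

# Hypothetical district of three schools (minority, majority enrollment).
dissimilarity = index_of_dissimilarity([100, 50, 50], [50, 50, 100])

print(majority, minority, meets_criterion, balance, dissimilarity)
```

In this made-up case the majority rate clears the 1.0 criterion while the
minority rate does not, so the school would not count as a success for
both groups even though its total enrollment matches the projection.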

Descriptions of the procedures employed in national studies that
go into relentless detail are not warranted in a report of this type.
However, the important aspect of this discussion is to point out the
necessity of matching the procedural aspects of the evaluation with the
nature of the question. Estimating the overall effects of a program or
policy is technically more complicated than answering other types of
questions; as a result, such studies are more costly. Thus, the clarity
with which the question is asked can drastically affect the level of
effort devoted to a particular study.

Evaluation in the Bureau of Education for the Handicapped

The Education for All Handicapped Children Act of 1975 (P.L. 94-142)
was not put into effect until October of 1977. According to Mary Kennedy,
the delay between the time the law was passed and its eventual effective
date allowed for the development of an evaluation plan encompassing
agencies at each level of government. The core of the evaluation plan
revolves around an analysis of the information needs of various audiences
and stipulations in the law. Six major questions ultimately emerged from
the BEH assessment of informational needs. Since 1976, 20 Special Studies
have been initiated or conducted; all of them are or have been conducted
by outside contractors. Each study has addressed one or more of the six
basic questions identified by BEH staff. The percentage of studies that
have been conducted for each question appears below.



1. To what extent are intended beneficiaries being served?      40%
2. In what settings are they being served?                      20%
3. What services are being provided?                            25%
4. What administrative mechanisms are in place?                 20%
5. What are the consequences of implementing the law?           60%
6. To what extent is the intent of the law being met?           65%
The percentages do not total 100% because multiple questions were
addressed by many studies. Eleven of the 20 studies addressed two
issues, 6 addressed only one issue, and three studies span aspects of
all 6 questions.
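
The two breakdowns given for the 20 Special Studies (by question and by
number of questions per study) can be cross-checked against each other.
The sketch below is illustrative only; it assumes each percentage
corresponds to a whole number of studies, and the variable names are
hypothetical.

```python
TOTAL_STUDIES = 20

# Percent of the 20 Special Studies addressing each of the six questions.
percent_by_question = [40, 20, 25, 20, 60, 65]

# Number of studies by how many questions each addressed:
# {questions addressed: number of studies}.
studies_by_breadth = {1: 6, 2: 11, 6: 3}

# Study-question pairs implied by the per-question percentages.
pairs_from_percentages = sum(
    round(p * TOTAL_STUDIES / 100) for p in percent_by_question
)

# Study-question pairs implied by the breadth breakdown.
pairs_from_breadth = sum(q * n for q, n in studies_by_breadth.items())

print(pairs_from_percentages, pairs_from_breadth)
```

Both routes give 46 study-question pairs, so the percentages and the
breadth breakdown are mutually consistent.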

Of particular relevance is the type of information obtained
through these studies. Prior to the implementation of the law, a
series of assessments was conducted to ascertain state capabilities
to respond to reporting requirements. Since the State counts, depicting
the number of handicapped children and their condition(s), are the basis
for allocation of funds, a study was initiated to determine the validity
of these counts, and methods for validating future counts were formalized.
Next, assessment of state definitions was undertaken in order to account
for state-to-state variations in the number of handicapping conditions
that are reported. In addition, an assessment of the difficulty of
implementing Individualized Educational Programs was undertaken. Each
of these studies provided information that was ultimately used in
formulating policy for reporting requirements.

Subsequent efforts have included analysis of data obtained from
states, a five-year study of a sample of school districts in which progress
in implementing the law is being observed, and an 18 month study of the
first year of implementation at nine local school systems. The Congress,
in the legislation, mandated a national survey of the nature and quality
of Individualized Educational Programs; this three-year effort was
initiated in 1977.

In 1978, a series of 5-year longitudinal case studies to ascertain
the consequences of P.L. 94-142 on various participants was initiated.
Each of these is funded at between fifty and sixty thousand dollars per 
studies were initiated to address specific issues associated 
with the implementation of the law and its impact on learning disabled
students, parent activists, quality of educational services, differential
impact for students with various handicapping conditions, and
school-parent relations.

The BEH Evaluation Plan. The foregoing describes the Special Studies
component of the evaluation of P.L. 94-142; other sources of evidence are
also used in the overall evaluation of P.L. 94-142. Kennedy describes how
individual sources of evaluative information are incorporated into the
overall evaluation. The basic concern was to ensure "breadth" (national
coverage) and depth, or rather, an understanding of the processes associated
with the implementation of the law. Four sources of information comprise
the overall evaluation of P.L. 94-142. These include: (1) descriptive
data from state agencies; (2) data collected through federal monitoring of
17 specific provisions of the act; (3) independent surveys; and (4) case
studies and other small-scale focused studies. Whereas the descriptive
data obtained from the states provide national coverage of the effects of
implementing 94-142, the special studies are intended to complement these
data in that they provide relevant information as to the process of
implementing the law and its impact on the quality of the educational process.

This overall plan represents an admirable use of multiple sources of
data which minimizes the reporting burden placed on SEAs and LEAs yet
provides relevant information for understanding the impact of implementing
the law.
CHAPTER 4. WHAT ARE THE CAPABILITIES OF EVALUATORS?



Georgine Pion, David S. Cordray,
and Robert F. Boruch



"The evaluators...were spread throughout...each endowed with the special
gift of their own group, and each using that gift in a special way....
But does that not make for much arguing among evaluators about who has the
most special gift of all?"



In Patton, 1980.
4. WHAT ARE THE CAPABILITIES OF EVALUATORS?

Chapter 4 presents a preliminary examination of the capabilities of
evaluation performers. The first section addresses some of the
misconceptions associated with this topic. The next two sections are devoted
to enumerating the problems associated with the identification of
evaluators and ascertaining their capabilities. In Section 4.4, the
institutional context and resources for evaluation are described with respect
to their influence on capabilities. Evaluation tasks, capabilities, and the
match between them are outlined in the fifth section for evaluators in
federal, state and local education agencies. Given that the use of
outside contractors has recently aroused much interest, the final section
explores some of the salient issues involved in the procurement,
monitoring, and capabilities of outside contractors.

4.1 MISCONCEPTIONS OF EVALUATORS AND THEIR CAPABILITIES

There have persisted various misconceptions concerning evaluators
and their capabilities. A common one has been that of "the evaluator"
who supplies all the necessary skills for any evaluation effort. Much
attention has been devoted to describing the characteristics of this
"person." While this may accurately depict those situations in which
one individual is contracted or assigned to conduct the research, there
are many instances where a group of individuals share evaluation
responsibilities. In these cases, judging the capabilities of each individual
against all tasks required by the evaluation is inappropriate. Rather,
capabilities of the group as a whole should be compared against the
necessary tasks.

In addition, the idea of capabilities needs to be more broadly
construed to include not only those of the primary evaluation staff, but
also the various resources which they have at their disposal. For
example, access to such facilities as Technical Assistance Centers,
interested universities, and other agency departments with trained personnel
must be considered when determining the presence or absence of evaluation
capabilities. At the same time, such factors as money and staffing
patterns cannot be overlooked. These may affect the organization's ability
to attract and retain competent individuals.

Another misconception is that there is a detailed armamentarium of
talents necessary for each and every effort associated with evaluation.
This view does not take into account the possibility that such efforts
as evaluability assessments may not require exactly the same set of
competencies as do cost-effectiveness strategies. Responsibility for
evaluation reporting to federal/state education agencies may not require the
ability to plan, design, and conduct a program evaluation. Consequently,
there is no immutable set of specific competencies against which all
individuals responsible in some way for educational evaluations can be
judged.
4.2 PROBLEMS IN IDENTIFYING EVALUATORS

In education, several of the same problems in identifying "evaluators"
surface as do in other arenas (e.g., mental health and criminal justice).
For example, anyone can easily assume the title of "evaluator." There is
no formal licensing procedure and no long tradition of training or
certification. The problem here is akin to the identification of engineers in
the 1930's and to some extent now: the school coach might be labeled
"evaluator" in the same sense that a boiler mechanic is labeled "building
engineer."

Identification of who is and who is not an evaluator, let alone
"competent," cannot simply be achieved through the use of one explicit
indicator. First, individual job titles often inadequately denote the presence
of evaluation responsibilities. While in some SEAs and LEAs there is explicit
mention of "evaluation" in the position title (e.g., Evaluation Specialists
and Evaluation Technicians), in other settings these individuals are known
as Divisional Assistants or Educational Research Specialists. Part-time
doctoral students, occasionally employed to perform evaluation tasks, are
typically referred to as hourly workers or temporary help. Even the
inclusion of departmental or divisional affiliation is not instructive since
many agencies do not have a distinct evaluation office. In these settings,
individuals function in multiple roles (program administrator, principal,
and evaluator) and, if queried, would more closely align themselves with
the administrative or instructional title and would not regard their tasks
as evaluative.

No single characteristic such as evaluation training or experience can
identify all evaluators. Formal degrees or certification in evaluation are
rare, given the recency of the field, and graduate programs are new and vary
substantially in the type and quality of training they provide. Adoption
of the title of "evaluator" by an individual also does not necessarily
imply that he/she possesses the appropriate qualifications or has actually
conducted an evaluation. The recent recognition that evaluation is a
growing enterprise with employment opportunities may have exacerbated this
problem. We are reminded of a conversation with an architect who, following
the surge of interest in social impact assessments, simply appended
"evaluator" to the specialties listed on his business card. In education,
including oneself on a state registry of available "evaluation consultants"
can be accomplished through a simple phone call.

Because a large number of people are often involved in an evaluation,
the problem of whom to include becomes important. Here the question
concerns whether or not the principal investigator, the project director, the
field coordinator, the instrument developer, or the analysis manager all
equally qualify as "evaluators." Time allocations and types of evaluation
activities cannot be rigidly assigned to these roles. For example, the
principal investigator(s) for major contracted evaluations may assume
full-time responsibility for the evaluation or simply serve as a part-time
"mentor" or fiscal manager. Even targeting only the author(s) of the
evaluation report can be problematic, given that authorship is not always
explicitly stated.
Highlighting the problems associated with identifying evaluators is
not to denote the futility of the effort but only the complexity which
is introduced into discussions of evaluators and their capabilities. One
cannot simply decide to look at only those individuals explicitly labeled
as "evaluators" by their job title or employment in an "Evaluation Office,"
as this would exclude a number of potentially relevant participants. It
cannot also be assumed that only one particular individual conducts the
evaluation and therefore is "the evaluator." Reliance on registry listings
of evaluators or "consultants" may result in the inclusion of individuals
who have never been involved in educational evaluations. We have tried to
be sensitive to these issues in our research and examine those people who
currently have major responsibility for evaluation efforts related to
federal education programs. These activities include large-scale national
evaluations of federally funded education programs, evaluation reporting
to federal/state agencies, and discretionary evaluation efforts related
to these programs at the state and local levels.

4.3 SPECIFYING THE CAPABILITIES OF EVALUATORS

In examining the capabilities of evaluators, two questions emerge:
(1) what are the capabilities required for evaluation? and (2) how can
these be understood? Concerning the first question, many individuals in
academia (e.g., Anderson and Ball), private contracting firms (e.g.,
Ingle and Klauss), and urban school districts (e.g., Webster and
Stufflebeam) have attempted to delineate the skills necessary for evaluation.
However, even these experts have often failed to reach perfect agreement.
Aside from crude differences in the numbers of skills and terminology,
there also have been differential emphases attached to various competency
areas. While the majority of individuals have included such competency
areas as managerial and communicative styles, technical proficiency has
typically received the greatest attention. In contrast, inclusion of
capabilities related to programmatic skills (i.e., substantive knowledge
in the program area being evaluated), policy-related expertise, and
"credibility" has been less uniform. Enumerations of the specific levels
of proficiency needed for various skill areas have also differed. Given
the early state of development of the profession, this diversity is not
surprising.

Factors Requiring Recognition in Specifications of Capabilities

What has resulted from this variation, however, has been conflicting
images of the evaluator. One describes a technically sophisticated
expert, facile in all evaluation and research methodologies. Another, which
is commonly evoked by some texts and short training sessions, suggests that
anyone can become an evaluator after surveying a "cookbook" of evaluation
methods or attending a three-day workshop. This confusion stems from a
number of factors, but most importantly, from differing assumptions of the
tasks involved in evaluation. For example, the model of the "lone
evaluator" requires a wider range of competencies within one individual than
does the notion of "an evaluation team" where individuals are assigned a
limited number of specialized tasks. Above all, the tasks required for
simple evaluation reporting to federal/state agencies do not warrant the
same degree of proficiency in many competency areas as does engaging in
discretionary evaluation efforts which go well beyond compliance with
federal/state regulations. Consequently, capabilities must be matched to the
evaluation tasks which are assigned.

Pragmatic considerations cannot be overlooked. Evaluation resources,
in terms of funding, time, and staffing levels, play an important role.
Organizations operate within visible constraints: if positions are not
allocated for evaluation, it is impossible to hire any staff, let alone
persons with evaluation training and experience. If evaluation funds are
minuscule, it becomes difficult to recruit individuals with sophisticated
backgrounds, given the cost of professionally skilled labor. The lack of
nearby universities or centers with evaluation expertise makes competent
advice, however brief, difficult to obtain. Such issues suggest that
these factors cannot be ignored in discussions of evaluation competency.

Gratuitous Criticism

Ignoring factors such as the ones specified above often results in
less-than-constructive criticism of evaluation capabilities. In judging
capabilities, it is difficult to resist judging competency based on what
activities one thinks should be occurring vs. what activities are required.
For example, districts and state education agencies who merely but
competently engage in required evaluation reporting activities cannot be
assumed to be incompetent or even incapable of engaging in additional
discretionary efforts; they may be simply prohibited from doing so. Instances
may exist where componentwise testing and assessment of program variations
are impossibilities, given inadequate funding and administrative support.

Given these conditions, we believe that rather than an ideal
evaluator, there is only an ideal evaluation. The elements of "good
evaluation practice" have been set forth earlier in this report. The ability
to successfully implement and execute these tasks rests partly on the
individuals involved and partly on the available resources. Both of these
will be addressed in this chapter.

Indicators of Capabilities 

With these considerations in mind, assessing capabilities requires
the simultaneous use of several crude indicators. First, relying on
formal training in evaluation and research is important but insufficient by
itself. Graduation from an accredited program with a formal degree or
certificate in evaluation is not a common phenomenon, given the recent
emergence of the profession. Consequently, evaluation experience,
professional membership in organizations, and productivity in terms of
publications, presented papers, and technical reports have also been
selected as indicative of evaluation capability.

Information relating to institutional evaluation resources will also
be presented. This includes such factors as evaluation funding, number
of professional staff, administrative support, and agency hiring policies.
It should be noted, however, that we have not had the resources to fully
characterize the professional capabilities of evaluators in federal, state,
and local education agencies over an extended period of time and in all
their complexity.

4.4 THE ORGANIZATIONAL FACTORS AFFECTING EVALUATION ACTIVITIES AND
CAPABILITIES

Before specifically examining capabilities, it is necessary to devote
some attention to the organizations which support and conduct the
process. The types of evaluation activities undertaken and the manner
in which they are performed are at least partly determined by the
organizational resources which are allocated to evaluation. The salient
aspects include the amount of money committed to evaluation, the number
of professional staff assigned to evaluation efforts, where evaluation
is placed within the organizational hierarchy, and how this location
affects the relationship of evaluation to the program under scrutiny.
These interrelated factors play a role in influencing the types of
capabilities available to carry out evaluation activities and the quality
of the resulting research.

The relationships of money to evaluation capabilities and conduct
are not obscure. Federal monies constitute the primary mechanism for
supporting evaluation activities and evaluators associated with federal
education programs. The amounts allocated to these line items often
determine the nature of evaluation efforts and the types of individuals
which the organization can attract to carry out the tasks. When individual
programs are small in size and funding, evaluation monies may be
minuscule. For example, one state official cited the minimal amounts
of evaluation monies (averaging $500-600) as partly responsible for the
poor quality of contractors and evaluation reports in the small local
Title VII programs within the state. Some states and districts do, however,
augment federal funding for evaluation and employ sufficient numbers
of highly trained personnel to conduct these efforts. In Site E,
evaluation activities for the handicapped program funded by P.L. 94-142
were improved due to additional district allocations for the evaluation.
These supplementary funds permitted the program to hire two
full-time professionals from the evaluation unit to work on Special
Education program evaluations exclusively.

In discussing evaluation capabilities, it is also important to focus
on the location(s) of evaluation within the organization. Where
evaluation is placed partly reflects an initial commitment by the
administration to producing accurate information and an understanding of
the resources required for evaluation. Various types of arrangements
exist, placing evaluation "inside" or "outside" the agency. These also
incorporate different relationships between evaluation and the program
under study. Certain types of arrangements are more prevalent within a
particular educational jurisdiction. While evaluations at the federal
level usually involve direct grants and the outside contractor model,
some efforts are conducted by agency personnel. At the state and local
levels, where both state administered and direct grants are employed,
there may be a number of organizational arrangements, as outlined by
the following diagram.

                         Relationship to Program

              Within the Program              Outside the Program

Agency        1. Program director             1. Director and/or staff
Personnel     2. Program staff with other        within an evaluation,
                 responsibilities besides        research, or testing
                 evaluation                      unit
              3. Program staff whose sole
                 responsibility is
                 evaluation

Non-agency    Review teams from the           Contractor outside
Personnel     federal, state, or county       the sponsoring agency
              program offices


It should be noted that for any given organization, multiple arrangements
can coexist either across or within programs. For example, the GAO
report of federal educational evaluations found that 27% of the local
district Title I evaluations and 31% of the Title VII evaluations were
conducted by both outside consultants and inhouse agency staff. Other
educational jurisdictions may also play a role in determining the
organizational placement for evaluation. In State II, an area populated
with small districts, the SEA "highly recommends" that local Title I
evaluations be conducted by an outside contractor. Debates are occurring
in Congress over potentially restricting the use of external "consultants"
and shifting their activities, including the conduct of evaluations, to
federal agency staff. The decision concerning where evaluation
responsibilities should be located partly centers around issues related to
inadequate inhouse resources (e.g., time and staff expertise) and partly
around the desire to increase the "objectivity" of the evaluation. We have
found that the "inside vs. outside the agency" distinction is less crucial
to the independence of the evaluation than the relationship of evaluation
to the program under study.

Federal Education Agencies 

Federal agencies charged with the support and conduct of evaluations
have been enumerated in the introductory chapter. Given the number of
organizations involved, our primary focus has been on the Office of
Evaluation and Dissemination (OED) because of its major responsibility for
sponsoring and conducting evaluations of federal educational programs.

The Office of Evaluation and Dissemination (OED). Prior to the creation
of the Department of Education, the evaluation unit within OED
was divided into three subunits: Elementary and Secondary Education,
Postsecondary Education, and Occupational, Handicapped, and Developmental
Programs. Approximately 40 full-time professional staff were employed.
The strategy used for carrying out its assigned evaluation responsibilities
has been the outside contractor model. This typically involves
the competitive bid process with an OED staff member assigned as project
monitor. Not only are specific evaluation studies conducted in this manner,
but also such activities as the development of state and local evaluation
[illegible] supervision of Title I technical assistance provided
to SEAs. It should also be noted that individual projects have been
undertaken by OED staff (e.g., the development of a series of technical
reports on the packaging of student financial assistance).

Since 1973, OED has administered an average of $21.4 million annually
in evaluation contracts. As indicated in Table 1, this money is
derived from a variety of sources such as Title I funds and OED's own
discretionary funds (labeled Planning and Evaluation). In total, the
amounts allocated to the agency in 1979 and 1980 are lower than the
peak funding year of 1978. What is interesting to note is the declining
[illegible] Evaluation. According to conversations
with OED officials, these funds are typically employed to initiate and
conduct studies in response to program managers' requests.

Under the new Department of Education, the evaluation staff formerly
within OED are now assigned to the Division of Program Evaluation,
one of the three units within the Office of the Deputy Assistant Secretary
for Evaluation and Program Management. The subdivisions within
this new unit and their overall responsibilities have essentially remained
the same. This unit is administratively independent from the
various program offices, reporting to the Deputy Assistant Secretary
for Evaluation and Program Management.



State Education Agencies 

As outlined in the previous chapter, state education agencies play
a major role in the evaluation process. Depending on the type of granting
strategy employed by a specific federal program, the SEA fulfills
such responsibilities as monitoring the compliance of its districts with
federal evaluation guidelines, aggregating, analyzing and reporting data
on the state-wide impact of federal programs, and ensuring that LEAs
receive proper technical assistance in program development and evaluation
efforts. The SEA can also engage in its own discretionary evaluation
activities for federal programs, and, in many cases, be instrumental in
setting the tone for evaluation in its respective LEAs. As Mathis and
Walling have noted in their report on SEA research and evaluation
organizations, the role of the SEA has recently become more proactive in
nature. This is at least partly due to such factors as the impetus of
court decisions on equal opportunity and funding, public disenchantment
with education, and increasing interest in accountability.

Table 1

AMOUNTS OBLIGATED FOR EVALUATION CONTRACTS OF OED
(in millions of dollars by fiscal year)

* Less .55 million transferred to the National Institute of Education for
completion of the study of Compensatory Education mandated by P.L. 93-380.

** Reduced level in response to DHEW ceiling on consultant services.

Source: OED, Annual Evaluation Report on Programs Administered by the U.S.
Office of Education, Fiscal Year 1979.

Illustrative case studies. Responsibilities for evaluation may be
assigned to various offices within the SEA. In some states there is a unit
distinct from program divisions while in others evaluation efforts are
scattered throughout the agency and nested within the individual programs.
The use of multiple strategies is also common, such as program staff for
evaluation reporting to federal agencies and outside contractors for
supplementary full-fledged evaluations. Information obtained from our site
visits may better highlight the diversity which exists.



State II. This SEA serves almost 300 districts and received over $100
million in federal educational monies (approximately 4% of its total
revenues) in 1979. All four programs of interest to this study (Title
I, Title VII, Special Education, and Vocational Education) exist, and
there are also 3-4 state-financed educational programs, within the SEA.
Although there is a distinct Bureau of Research and Assessment, it is
not assigned evaluation responsibilities and only occasionally provides
technical assistance in evaluation to individual programs. There is
one full-time evaluation position within the Title I Program Office,
and there is a separate Title VII Technical Assistance Project Office
within the Bilingual Education Program. For Special Education, a Program
Development and Evaluation Bureau is housed inside the program.
Occupational Education has its own Bureau of Planning, Research, and
Evaluation. This organizational arrangement makes the separation of
evaluation from the program under scrutiny impossible. Although
many individuals participate in evaluation activities, only three are
considered to be "evaluation" professionals by the staff within the
SEA. Outside contractors are occasionally employed to conduct special
projects for Occupational Education, Special Education, and Title VII.
While evaluation is an interest of the central SEA administration,
sparking an inhouse study of the agency's evaluation duties and
capabilities, there is little move to centralize evaluation within the
agency.



State VI. The budget for the Research and Evaluation Office in this
SEA totalled over $4 million in 1979, representing an aggregation of
20 separate state and federal funding sources. In federal education
monies alone, the state received over $600 million. There are Title I,
Title VII, Special Education, and Vocational Education programs in
this SEA. The Evaluation Office has responsibility for evaluating
all programs, with the exception of Vocational Education. Almost half
of the unit's operating funds are supported by federal monies, along
with almost three-fifths of the full-time staff positions. Forty-four
professionals are assigned to this office, and 32 of these are employed
specifically in evaluation capacities. The office is organized into
subunits based on particular program areas (e.g., Bilingual Education
Evaluation). Outside contractors are primarily employed to conduct
studies mandated by the state legislature for its own state-supported
programs. The unit enjoys considerable visibility in the organizational
hierarchy, reporting directly to the Chief State School Officer, and
is administratively independent from the programs. In fact, it was
said that the Superintendent has been very instrumental in fostering
the development of this unit.






State IV. This SEA, which governs approximately the same number of
districts as State VI, received over $300 million in federal educational
monies in 1979 and operates some of its own state educational programs.
While there is a distinct program evaluation unit, it is small, having
only 5 full-time staff positions. Its primary responsibility is evaluation
for Title I and state-supported programs, although several other
federal programs with evaluation requirements exist within the state.
The unit is administratively separate from the program offices but several
layers down in the organizational hierarchy. Its budget is small,
and political factors are such that increased support for evaluation in
the future appears unlikely.

The prevalence of evaluation centralization and coordination. As
indicated by the case studies presented above, an SEA may or may not
have a distinct evaluation unit. One source of information on the
prevalence of such units is an American Educational Research Association
(AERA) survey conducted by Mathis and Walling. They found that the
majority of states did have identifiable units for research, evaluation,
assessment, and development activities. Our phone survey revealed that
75% of the 36 SEAs contacted had distinct evaluation units or other
offices (e.g., Federal Programs) which assumed at least some evaluation
responsibilities. Only in a minority of SEAs was evaluation totally
nested within all individual programs.

Centralization of evaluation, however, varies in degree. As indicated
in the case studies, evaluation units may perform neither required
nor supplemental evaluation activities for all programs. For
example, no evaluation unit in our phone survey had responsibility for
all four major programs chosen by this study. Approximately one third
of the units had some evaluation responsibility for three of these
programs, 23% had responsibility for two, and another one third had
evaluation responsibility for only one program. A small percentage did not
evaluate any Title I, Title VII, Vocational Education, or Special
Education programs. Most units (77%) had some involvement in Title I
evaluation, and half of the units participated in Special Education program
evaluation activities, typically those supported by direct grants.
Evaluation of Vocational Education programs was the responsibility of
units for only one quarter of the SEAs in states with these federally
funded programs.

What this may suggest is the underutilization of evaluation units.
While the infusion of federal monies and their associated evaluation
requirements have facilitated the creation of these offices, this has not
been true across all states which must comply with these federal evaluation
demands. Even when an evaluation unit has been established, programs
with federal evaluation requirements may choose not to, or be unable to,
utilize its services.

The nature of evaluation activities assigned to specific programs
partly influences the location of evaluation within both the organization
and the program. For example, the types of data required by the
federal government for Special Education programs tend to place evaluation
in the hands of program personnel rather than evaluation unit
staff (see Chapter 3 for a description of these activities). However,
there are SEAs where the evaluation unit does assume responsibility for
supplementary efforts associated with these programs. When there is a
competently and adequately staffed evaluation unit, it can improve the
quality of efforts, facilitate planning, and increase the use of
information in decision-making. Centralization of evaluation also helps
in assuring that evaluation activities are not redundant and in developing
hiring guidelines which tend to reflect a recognition that certain
specialized skills are necessary.

The State Board of Education's and Superintendent's administrative
philosophy can help in promoting centralization. When accountability
is a concern, a need for valid information is perceived, and policy
leadership is a goal, there is usually a move to centralize and coordinate
evaluation activities under one roof. For example, in State I
the Superintendent adopted a vigorous role by publicly announcing the
need for high quality evaluations and the subsequent assignment of them
to a distinct evaluation unit with sophisticated skills.

The administrative independence of evaluation. The location of
evaluation within the organizational hierarchy differs among SEAs. For
example, Mathis and Walling found that 32% of the SEA research, assessment,
and evaluation units reported directly to the Chief State School
Officer (i.e., the Superintendent or the Commissioner), 56% reported to
an Assistant Commissioner/Superintendent, and 12% reported to a Bureau
or lower level. Certainly, direct access is important in creating
opportunities for policy input and also typically assures that evaluation
is separate from programmatic constraints. It may be that this latter
aspect, administrative independence, is as important as direct access
to policy makers. In our phone survey, over 81% of the units with
evaluation responsibilities did not report to the same branch as the
instructional components, although a smaller proportion reported directly
to the Superintendent. In SEAs where there is no distinct unit, this
separation from program pressures to present favorable results can be
achieved by the use of an outside contractor. The situation where
evaluation is conducted by program staff precludes the possibility of
administrative independence for evaluation.

Resources for evaluation. Very little is known about the resources
for evaluation within SEAs. Previous research has tended to look either
at evaluation units per se or at SEAs as a whole, without considering
that some have evaluation units and some do not and that the number of
programs for which SEAs have evaluation responsibilities differs among
states. Despite these problems, information can be derived from these
initial efforts.

For the 32 SEAs reporting budgetary data in a survey conducted by
Sharp and Frankel at the Bureau of Social Science Research (BSSR), the
median level of research, development, dissemination, and evaluation
(RDD&E) expenditures was $374,000 in FY 1976-77. Half of the SEAs reported
fewer than 10 full-time professionals engaged in these efforts. Given that
evaluation is only one activity within the agency (according to Sharp and
Frankel, about 38% of staff time is dedicated to evaluation and policy
studies), this suggests much smaller resources for evaluation. It should
be noted, however, that 9 SEAs did report heftier sums of more than $1
million for RDD&E activities, and 13 SEAs reported over 20 full-time
professionals employed in these efforts. These still constitute a minority.

Mathis and Walling found that the mean expenditure for units reporting
to the Assistant Commissioner/Superintendent level was $2.8 million while
units reporting to Bureaus and lower levels averaged $342,000 annually.
Research units higher in the organizational chain of command also had an
average of 38 professionals as compared to 6 professionals for units
reporting to offices further down in the hierarchy. The case studies
already presented and other observations from our site visits suggest that
resources may diminish as evaluation moves further away from the chief
decision-maker. In at least three SEAs, the Superintendents played a major
role in creating capable evaluation units. Not only did they commit
adequate sums of money to support staff and activities, but they also
elevated the unit in the hierarchy so as to guarantee independence,
emphasized that program evaluations were to be conducted by the unit, and
assured that high-quality professionals staffed these offices. Capabilities
for and conduct of both state and federal programs profit from such
actions.

Concerning federal evaluation efforts, our phone surveys revealed that
half of the SEAs with research and evaluation units had 1-2 full-time
professionals responsible for evaluation activities associated with federal
programs. One quarter of the SEAs employed 3-5 professionals, 4% had 6-9
individuals, and 13% had 10-19 full-time professional staff assigned to
these tasks.

The federal government is implicated in this process in terms of the
resources it awards the SEAs for carrying out their required evaluation
responsibilities for federal programs. Our phone surveys and site visits
indicated that the majority of evaluation professionals in SEAs are
supported by federal money. Funds have also been targeted for strengthening
states' abilities in educational practices (e.g., Titles IV-C and V),
including the support of evaluation unit professionals. The reduction of
federal monies in this area could certainly affect the numbers of staff
available to conduct evaluation activities for federal programs.

It was also remarked that evaluation prospers only in times of economic
stringency when accountability rears its head. This would suggest that
plentiful times are ahead for evaluation and its support. At the same time,
however, accountability can assume many shades of meaning. In State IV, a
campaign to reduce government bureaucracy has included evaluation positions
in its target areas for decreased personnel. Several of the SEAs with which
we spoke were concerned over impending staff cuts, especially given the
number of requests for evaluation from both SEA and LEA programs. Caulley
and Smith's survey showed similar problems being expressed by SEA
evaluation unit directors. We do not yet have an "evaluator index," such as
there is in surgical manpower research, which can estimate the number of
professionals required for a specific task, depending on the complexity of
the activity. Consequently, no explicit criteria exist for determining
appropriate staffing levels in terms of both talent and numbers. Before
these complaints of inadequate staffing can be peremptorily dismissed,
however, we need a better idea of the resources required for the evaluation
process.

Local Education Agencies 

Within local school districts there are three typical locations, similar
to SEAs, for evaluation: (1) evaluation activities are assigned to staff
within the individual programs; (2) there exists a distinct unit in the
district, complete with its own staff, whose responsibilities include
evaluation, typically for multiple programs; and (3) individuals from
outside the agency are contracted to conduct evaluations. A fourth
possibility exists, i.e., federal, state, regional, or county review teams
or centers which handle district programs. However, this activity typically
centers around compliance monitoring for adherence to program guidelines.

The prevalence of evaluation centralization. The UCLA study conducted
by Lyon and others provides support for the prevalence of an evaluation unit
in districts with over 10,000 enrollments. Over half of the districts had
evaluation units, with 91% assuming at least some responsibility for
evaluating locally funded programs and 76% for evaluating federal/state
funded programs.

However, similar to the situation in SEAs, within any given district
the existence of a unit does not always imply extensive centralization of
evaluation activities. While in Site G the evaluation unit shouldered
responsibility for program evaluation activity in the district, in Site J
the evaluation unit simply provided advice on evaluation for interested
programs. Declining resources in this district had reduced the evaluation
unit staff from 22 to 2 full-time professionals, and consequently, the
unit could no longer be responsible for evaluation activities as was its
previous practice.

The administrative independence of evaluation. The organizational
location of evaluation within the school district reporting structure has
important implications for the autonomy of evaluators. In the UCLA study of
district evaluation units, only 37% of the respondents' organizational
charts demonstrated that the evaluation unit reported directly to the
Superintendent. Generally, evaluation offices were more likely to be in one
of the typical lines of authority (e.g., Instruction, Administration, or
Support Services) rather than in direct line to the Superintendent. We
strongly suspect that direct linkage to the Superintendent may not be as
crucial as the organizational branch to which evaluation reports, i.e.,
administrative vs. programmatic/instructional. In our site visit sample,
while only 2 of the 9 evaluation units reported directly to the
Superintendent, five reported to some intermediary unrelated to programs or
instruction. This administrative separation of evaluation from the programs
is what helps to guarantee independence of the evaluators from program
demands. When the evaluation unit reports to the same authority (other than
the Superintendent) as the programs, there may be subtle pressures to
present favorable results, and job security may become linked to positive
evaluation findings. Districts which do not have evaluation units and
assign evaluation responsibilities to program staff are also hampered in
this respect. The use of outside contractors, when there is no independent
unit within the LEA, is one method of obtaining administrative
independence. This can be quickly eroded, however, if positive results are
perceived by the contractor to be a condition for rehiring, and further
employment is the contractor's primary concern.

Factors influencing where evaluation is located. The location of
evaluation can play an important role in determining its relationship to
programs and the degree of centralization within the district. The
existence of a unit tends to increase the possibility that evaluators
experience freedom from program demands and pressures. The choice of which
arrangement(s) are used depends upon a number of factors. It is partly a
function of district size. For example, the UCLA study conducted by Lyon
and others found that 89% of the metropolitan districts (45,000 or more
enrollments) had evaluation offices while 59% of the large districts
(25,000-44,999) and only 33% of the medium districts (10,000-24,999) had
such units.

Coupled with the size of the district is the tendency for large and
metropolitan LEAs to receive more substantial educational support from
non-district sources. In many cases, the more money awarded the district
by multiple state and federal agencies, the more likely that evaluation
requirements exist which must be fulfilled by districts. The collection
and subsequent reporting of this data may be greatly facilitated by a
coordination of evaluation efforts. In fact, the advent of federal/state
evaluation requirements was cited by 72% of the districts in the UCLA sample
as providing the impetus for creating evaluation offices. Two thirds of
the evaluation units in our site visit sample also indicated that this was
a major reason for their establishment. Although requirements are not the
sole motivation for centralizing evaluation within school districts, they
certainly have played a role.

Variations in federal/state funding procedures for evaluation, the
existence of evaluation funding "ceilings," and other evaluation
requirements also determine the organizational arrangement and location. For
example, there is usually some explicit or implicit and idiosyncratic
"rule of thumb" used in assigning the percentage of program funds which can
be appropriately targeted for evaluation activities. When total program
funding is small (e.g., $100,000), 1-1.5% "set-asides" (here, $1,000-1,500)
are typically insufficient for conducting evaluation efforts. Unless the
district itself pledges financial support, the program may find itself with
inadequate funds to hire its own evaluation staff, procure an outside
contractor, or even partially contribute to the salary of an evaluation
unit staff member. Thus, evaluation simply becomes appended onto the job
responsibilities of existing program personnel who can devote less time to
its execution.

Explicit or implicit "guidelines" as to who should conduct evaluations
also influence the choice of organizational framework. For example, the
Title I evaluation office in State II has "urged" its predominantly small
LEAs to employ outside contractors for their Title I evaluations. Only
for those districts demonstrating that they have strong evaluation
capabilities, as evidenced by a distinct evaluation unit, can this
"recommendation" be waived, and the hiring of an external "auditor" for the
evaluation itself is urged. In our site visits we found that Title VII
program personnel overwhelmingly perceived the use of outside contractors
to be a federal requirement, although there is no explicit federal
requirement for this. Only in Sites F and B did the evaluation unit conduct
the evaluation and, in the latter case, this was in conjunction with an
outside contractor. This perception, especially in districts with competent
and independent evaluation units, led to disgruntlement. Conversations with
OBE program staff suggest that districts are encouraged to choose any
strategy which would result in a high-quality evaluation, but project
officers may differ in their suggestions to LEAs concerning the best
strategy to employ.

The district Superintendent and School Board also can be influential.
For example, in Site H, a district where program staff performed all
evaluation activities, the Superintendent expressed little interest in
program evaluation and saw no need to change this situation for the purpose
of coordinating efforts or ensuring independence. In Site C, where there
was a one-person unit for a large urban school population receiving $5
million of federal funds, the Superintendent was described as too involved
with political skirmishes and indictments to devote interest to evaluation.
In contrast, in Site B, a district half the size of Site C, the unit
enjoyed considerable support, autonomy, and visibility. The evaluation
unit had 10 full-time professionals and a budget triple that of Site C.
One reason for this centered around the Superintendent's philosophy and
the Assistant Superintendent's extensive background and experience in
evaluation. The School Board in Site E recently approved the hiring of
six additional evaluation professionals with primary responsibility for
executing Board-requested evaluations. Thus, the commitment by these
policy-making bodies determines both where evaluation may be placed in
the organization and the level of available resources for evaluation efforts.

Resources for evaluation in districts with no evaluation units. In
general, programs solely relying on their own staff to conduct evaluations
have minimal resources to devote to the evaluation process. This may
result for a variety of reasons: few federal/state programs with explicit
evaluation requirements operate in the district, those programs which do
operate involve small numbers of students, or the district central
administration places little emphasis on program evaluation activities.
One reason frequently expressed by program administrators in our site
visit sample concerned the absence of adequate funds for evaluation. In
some cases, even if programs could commit more money to the evaluation
process, they are presently struggling to maintain past levels of service
delivery, given funding cutbacks. Consequently, evaluation tasks simply
become appended onto the job responsibilities of program personnel who
seldom have research training.

Seldom is an individual with full-time evaluation responsibilities
hired within the program unit. For example, in Site until recently
there had been a full-time Vocational Education evaluator, but declining
resources forced the elimination of this position. Only in Title I
programs, where more substantial set-asides are permissible, is this
staffing arrangement, which allows more concentration of effort, made
possible. However, independence is still absent in these situations.

Resources for evaluation in districts with evaluation units. According
to the UCLA study, the typical evaluation unit's budget is less than $90,000,
comprising 0.2% of the district's total operating budget. However, there
is considerable variation across districts (e.g., for 1977-78 evaluation
budgets in this survey ranged from $2,000 to $4,000,000), and even within
similar size districts.

The same study also found that, on the average, only 18% of the
evaluation unit's budget is supplied by the federal government, with the
remainder stemming from the state and the school district itself. Less
than one half of the evaluation unit's budget is actually devoted to
evaluation activities, and the typical unit dedicates only 20%.

It is not surprising then that the UCLA study found the majority
of evaluation offices to be small, e.g., two or fewer employees. There
is substantial variation in these staffing levels, however. For example,
in our own site visit observations, units were staffed with anywhere
from 1 to 22 full-time professionals. Even in districts where enrollments
and federal funding were similar, evaluation unit staffing patterns
ranged from 2 to 10 full-time professionals.

Although there is no consensus as to what constitutes adequately
staffed evaluation units in terms of the number of professionals, Webster and
Holley have developed some informative guidelines. The possession of at
least 3 full-time professionals (e.g., evaluation managers and specialists)
was defined as a minimal requirement for a 10,000 pupil unit, 4 for a 20,000
student unit, and 17 for a 40,000-50,000 pupil district. Employing these
criteria, 67% of the units in our site visit sample were severely
understaffed, e.g., 2 professionals for a district with 136,000 students. Given
that these criteria were developed six years ago and that evaluation
requirements and requests have multiplied, these prescribed staffing levels
might even be conservative in terms of present demands. It is not surprising
then that evaluation units differ with regard to the number of programs
for which they have evaluation responsibilities and the types of activities
undertaken.

Approximately half of the Directors of Evaluation units in large school
districts reported in the Lyon and others' survey that they did not have a
budget sufficient to meet "federal, state, and local requests for evaluation
activities." Nearly 70% said their unit's personnel resources were not
adequate to meet current evaluation demands. Approximately 94% said they had
too few staff. What is important here is that evaluation activities do not
focus only on federal/state required efforts but involve much more.
Consequently, as the number of requests for evaluation increases, additional
staffing may be required. Compliance-oriented evaluation activities in terms
of meeting federal/state requirements necessitate a certain number of staff.
Engaging in additional evaluation activities for federally funded programs
may require even more staff, not only in terms of bodies but also in terms
of more sophisticated competencies.

Findings from our phone survey also suggest that staffing levels may
affect the quality of evaluation. Districts that took pride in the fact that
evaluation activities for Title I were minimal had less than one full-time
equivalent individual responsible for evaluation or assigned responsibilities
to teachers and administrators. It was boasted that an $850,000 Title I
program was "evaluated in one week." In contrast, districts that attempted to
engage in efforts which went beyond sheer compliance with federal
requirements had at least one full-time professional and often 2-3 individuals
assigned to Title I evaluation activities. These were typically the districts
where program and evaluation funds were substantially larger than in the
former cases.

4.5 CAPABILITIES FOR EVALUATION

The actual execution of these responsibilities incorporates a number
of activities. Interviews with OED staff concerning their role in
the contract monitoring process itself may provide a sense of the diverse
activities involved and the capabilities required of these individuals.

Preparation and issuance of the RFP. The project monitor is assigned,
based on his/her interest and expertise in the specific study
area, to a particular contract. The preparation of the work statement
may require such activities as meeting with program personnel in other
Divisions and Departments, obtaining advice on technical matters from
experts in the field, and surveying pertinent research. A draft of a
detailed work statement is prepared and reviewed internally and by
selected others. Changes are incorporated, and the final work statement
is completed. This includes the objectives of the study, its basic design
(e.g., sample selection, instruments, and data collection and analysis
strategies), the specification of evaluation tasks, the types of personnel
required, supporting documents (e.g., the legislation which initiated
the study and program information), and the criteria for proposal
review. The RFP is then published in the Commerce Daily, and contractors
are given a specified number of days to respond.

Selection of the contractor and award of the contract. The project
officer then assembles and chairs a panel of experts to review the
submitted proposals. The proposals are distributed to panel members, review
criteria explained, and meetings convened. Frequently, before the final
decision is made, two proposals are selected, and the respective bidders
are requested to submit a "best and final offer," responding to specific
questions posed by the reviewers and synthesized by the monitor. Once a
selection has been made, negotiation must commence with the contractor
concerning such issues as the tasks outlined in the proposal and their
execution.

Monitoring of the contract. Monitoring a contract involves overseeing
the study during all phases. For example, this may include providing
assistance in selecting members of advisory committees and attending
their meetings. Monitors can also advise on the development of
instruments, design of sampling procedures, and the selection of case
study sites. They are the ones responsible for ensuring that the instruments
designed wend their way through the maze of formal clearance procedures.
This may involve detailed documentation and persistent monitoring
of the progress status of the submitted forms. Results of pilot tests are
reviewed, and the project officer may travel to selected test sites.
Eliciting the cooperation of sites may be involved, along with attending
training sessions for field personnel. The project monitor must be available
to provide advice on any questions which surface during the data collection.
We have heard Project Directors' reports of an average of 2-3
telephone calls per week to the monitor and as many as 8 calls per day
during critical stages of the research.

Once analysis has been initiated, the project officer may assist in 
designing the analysis strategy and targeting areas which look unusual 
or especially interesting and require additional probing. In some cases, 
similar types of assistance are provided to the subcontractors.

Throughout the process the project officer must keep abreast of
events within the federal government. For example, Congressional
hearings may be scheduled where a presentation of preliminary results
would be useful, and it is the monitor's role to guarantee that information
is prepared and made available to the relevant parties. Another
responsibility is to ensure that the results will provide answers to
the substantive and policy-related questions which initiated the study
and to also incorporate other important information which subsequently
emerged during the course of the research. Approval of contractor
invoices also begins with the monitor before continuing through the
series of signatures required for payment.

During the final stages of the contract the project officer reviews 
and comments on preliminary drafts of the report. 

Upon completion of the contract. The production of a nontechnical
concise project summary to be distributed to relevant parties is the
responsibility of the project monitor. It should be noted that this may
not be common practice for many agencies who contract out research. In
addition, the monitor checks on how the clearance process is progressing
for release of the final report and can be involved in dissemination of
the report to relevant parties.

Long after contract termination the project officer may still field
inquiries related to the study and provide information to other programs
and individuals interested in the findings.



Appropriate capabilities. Given this wide range of tasks, it is not
surprising that the monitoring process has been likened to "doing a
dissertation four times over." Even this may be a conservative description
since project officers typically monitor 2-3 projects simultaneously,
depending on the scope and complexity of the individual studies. Performing
these tasks requires intensity and breadth of experience. Capabilities
include expertise in applied social science research (e.g., statistics,
design, sampling, and measurement), sensitivity to the problems posed by
field research, and interpersonal and policy-related skills.

The staff in this office generally reflect these capabilities. In
terms of formal training, almost three-quarters of the middle and senior
level professionals have earned their doctorates or are in the process of
obtaining them. Over 80% of these degrees are in research-related areas
with one-half of these specifically focusing on educational theory and
research (e.g., educational measurement and evaluation, educational
research, and sociology of education) and half encompassing such relevant
fields as social psychology and economics. This points to the diversity
of expertise which is available within the unit to oversee the evaluations
and develop policy implications in the areas of legislation, budgeting,
and program development and administration.

Given that formal training in evaluation is an insufficient indication
of capability, experience in evaluation and applied research is also
important. The staff in this unit have acquired this experience in a variety
of ways. The Office itself has often served as a training ground and
socialization mechanism for these individuals. Over one-third of the
professionals have been in the office for 6-9 years and approximately
one-fifth for 10 years or more (at a time when evaluation was a new
field of inquiry). Staff have also brought to the agency a wealth of
experience from other sectors. For example, almost 75% have had previous
experience in LEAs, SEAs, and other federal agencies, and over half of these
were employed in specific evaluation capacities. Local and state agency
experience is common for those individuals associated with the development
of evaluation models and monitoring of technical assistance. One-half of
the staff have also had teaching, management, and/or research experience
in academic settings, and approximately one-fifth have previously
conducted research in the private sector.

Professional activity is also common, and a majority of the staff
have been active in a number of professional organizations (e.g., American
Educational Research Association, National Council on Measurement in
Education, American Psychological Association, and the American Statistical
Association). Individuals have produced technical reports, chapters in
books, journal articles, and papers for professional meetings. A conservative
estimate would show that over half of the staff have published since
their arrival in the office. Of these, over two-thirds have produced
three papers/articles or more. For example, a few individuals have
written 8 or more manuscripts in education in the two years they have
been in the unit. This rate certainly rivals that of many academics.
Much of this can be attributed to the professional commitment and calibre
of the staff and the promotion of this activity by OED administration.



Capabilities of State Education Agencies

As noted in the previous section, the organizational locations and
resources for evaluation in SEAs are diverse. While some states have
distinct units in which evaluation is housed and some even have further
subdivisions focusing on various program areas being evaluated, other
SEAs have evaluation responsibilities and staff spread across the
individual programs. Even within those SEAs having specific units,
staffing patterns vary. In some SEAs the unit enjoys considerable
visibility in the agency hierarchy, reporting directly to the Chief State
School Officer; in others it almost requires a magnifying glass to isolate
it on the organizational chart.

Previous research which examined the expertise of SEA research
and evaluation staff has reinforced this impression of diversity. For
example, Mathis found that 30% of the individuals employed in SEA
research-related activities had earned their doctorates, 46% their master's
degrees, 14% their baccalaureates, and 10% a variety of other credentials.
The greatest proportion of these degrees were not in research-oriented
specialties: only 9% were reported to be in psychology, 7% in testing, 3%
in statistics, and 2% in computers. The most frequently reported category
was administration (29%), with other areas ranging from curriculum to
geography.

Reanalysis of the BSSR data revealed that the majority of organizations
(74%) engaging in research, development, dissemination and evaluation
activities within SEAs had at least one-quarter of their staff possessing
doctorates, primarily in the field of education. Evidence of research and
evaluation training within this area is difficult to decipher, given that
the survey categories were relatively broad. "Education" encompassed such
varied areas as educational research, early childhood education, and
curriculum. In distinguishing between those organizations which stated that
evaluation was a primary emphasis vs. those which did not,
"evaluation-oriented" SEAs had slightly greater proportions of their
professional staff whose primary expertise was in mathematics and statistics
than did agencies not citing evaluation as a focus.

While such overviews of the RDD&E "capabilities" in state educational
agencies are instructive, the tasks assigned to these agencies must be
considered. SEAs are charged with many responsibilities, only one of which is
evaluation. Even this area includes evaluations for both state and federally
funded programs. Responsibilities delegated to the state for evaluating
federal programs include: (1) monitoring to ensure that LEA plans and
programs comply with federal guidelines, as interpreted by the state; (2)
annual aggregating and reporting of data for these programs on a state-wide
basis; and (3) technical assistance to LEAs in their program and evaluation
responsibilities. The range within each of these categories can also vary,
depending on the specific program. SEAs themselves can adopt the posture
of servicing requests by LEAs for technical assistance, or as in State VI,
create a separate state-funded technical assistance program. In addition,
a number of SEAs have assumed additional evaluation responsibilities, such
as the development of standards and validation procedures, conduct of
special evaluation studies, SEA and LEA evaluation staff training, and
vigorous contract monitoring. Consequently, the types of activities
assigned to the SEA must be considered in judging the adequacy of staff
capabilities. SEAs that engage in discretionary and sophisticated efforts
require evaluation staff with advanced levels of competencies.

Illustration of time-on-task for evaluation activities. To provide
the reader with some idea of the time involved in executing evaluation
tasks, data are presented from the evaluation unit in State VI.
This information reinforces the notion that allocations of time can vary
across activities and program areas being evaluated. It also shows the
substantial amount of time required by provision of technical assistance
to both LEAs and other branches of the respective SEA. For example, in
the Title VII division, staffed by two full-time professionals and one
part-time individual, over 44% of the person-days is devoted to providing
LEAs with assistance, and 24% of the time is allocated to the conduct of
special studies. Preparation of annual reports to the federal government
consumes 19% of the person-days in this unit, with the remainder of the
time being devoted to consultation on matters related to assessment, staff
development, and liaison with other agencies. Time spent on evaluation
activities associated with other programs in this SEA can range as follows:
generating evaluation questions (3-8%), designing evaluation studies (3-7%),
development of instruments and data collection procedures (8-9%),
collecting data (10-25%), analyzing and interpreting data (8-19%),
writing, presenting, and disseminating reports (6-20%), providing
evaluation assistance to LEAs (20-29%), and providing evaluation
assistance to other SEA staff (2-20%).

The preceding example is based on a unit which has evaluation
responsibility for most federal programs operating in the state and
engages in sophisticated evaluation efforts. Staff capabilities reflect
a high degree of research and evaluation training and experience. There
are both federal monies for evaluation activities and individual state
commitment to the evaluation process. No one can expect SEAs with 2
or fewer staff to launch efforts which match in scope and complexity
those of this SEA. SEAs lacking distinct units for research and evaluation
may be prevented from participating in efforts aimed at planning and
coordinating evaluation activities across various programs and servicing
departmental requests for supplementary evaluation efforts.

Where evaluation is placed within the agency also helps to influence
which individuals will be selected to assume evaluation responsibilities.
Program personnel are usually hired for their program expertise, not for
their evaluation and research backgrounds. Even when there are full-time
evaluators within the program, they may end up devoting much of their
energy to program implementation, given that they may know more about
the process than the program director. On the other hand, evaluation
staff for a distinct unit are often hired with their research expertise
and training in mind. For example, evaluation job specifications in
States I and VI require at least an advanced degree in research-related
fields and professional research experience of all applicants.

Results from our phone survey reinforced this notion: for SEAs with
distinct evaluation units, staff characteristics in general reflected
doctoral level research/evaluation training for at least 68% of the units
surveyed. Within 63% of these units, over half of the staff had earned
doctorates in such fields as educational research, social psychology, and
testing. When evaluation responsibility was spread across the individual
programs, it was much more likely that the staff had not acquired formal
training in evaluation. On-the-job training and attendance at workshops
were more frequently reported in these cases.

Illustrative case studies. No aggregate statistics reflecting the match
between SEA evaluation tasks and capabilities can be offered as this would
require a separate study in itself. However, relevant case studies from our
site visits can offer a picture of the capabilities which exist in certain
SEAs. What will be presented below are short descriptions of three state
educational agencies which typify the range and types of evaluation activities
occurring at the state level. The capabilities of the staff will be illustrated.

Stat|J^ When fully staffed, this unit employs 41 individuals, 71% of whom
are full-time professionals. It is lodged in a SEA which vigorously
promotes evaluation. Much of this commitment can be attributed to the State
Superintendent and School Board who have publicly announced that
"evaluations worth doing are worth doing well." Consequently, statements have
been issued regarding the need for autonomy of evaluation and expert staff.

Official policy has defined the efforts of the evaluation and assessment
unit as follows: (1) the planning and development of the statewide
assessment system; (2) the description and evaluation of the impact of state
and federal programs; and (3) the assessment of educational needs. To
accomplish these goals, the unit is divided into two main branches:
assessment and evaluation/research.



The evaluation and research unit is further divided into four components.
One focuses on data collection, another is responsible for coordination and
dissemination, and two are devoted to evaluation per se. The activities
associated with evaluation cover a wide range: both reporting to federal
agencies and the conduct of special evaluation studies. Evaluators are
typically assigned to a given program and develop an explicit "services of
agreement" with the program for evaluation. This helps in evaluation
planning, coordination of efforts, and specification of products. Additional
studies are also conducted. For example, the Special Education program is
receiving assistance by evaluation unit staff in examining the impact of
its programs, and special studies on the relationship of certain variables
to successful Title I programs have been performed.

At present, within the two evaluation subunits, there are 9
professionals, in addition to the branch supervisor and the director of the
entire unit. These individuals have earned their doctorates in
research-related fields (e.g., testing, evaluation, and educational psychology).
Experience in educational research is common, and many professional papers
are produced. The unit has taken advantage of employing interested fellows
associated with the national Educational Fellowship Program. Technical
expertise, background in educational theory and methods, and other skills
exist upon which to draw in conducting evaluation activities.

This is partly fostered by the actual hiring process itself. To be
eligible for evaluation positions, individuals must have obtained an M.A./
M.S. or 30 graduate hours toward the doctorate in educational research or
the social sciences. A specified number of courses in statistics,
measurement, and design are required. Three years of professional experience in
educational or other empirical research or a doctorate is the minimum.
An oral examination, described as "comprehensive" and "rigorous," is also
required, covering both theoretical and methodological issues.

The fact that state positions remain under the jurisdiction of civil
service imposes certain requirements, namely that all applicants must have
gone through the civil service application process. Since this occurs only
once a year, it results in applicants passing the requirements but being
unavailable 10 months later when a vacancy is finally announced. What
also occurs is that the appropriate individual is available and known to
the agency, but cannot be hired due to his/her failing to have completed
the application process.

State II. Evaluation staff in this SEA are scattered throughout the
individual programs. Although there does exist an office devoted to
research and testing, evaluation has not yet been incorporated under its
job assignments. Only four individuals are distinctly associated with
evaluation, and they are located in both the specific program offices
and within the research and testing unit itself.

Evaluation activities vary, depending on the individual program.
While all programs must report certain evaluation data to the federal
government, the type of data required varies. Technical assistance to LEAs
in evaluation has become a major priority for most programs in the SEA,
but the extent to which this has been accomplished is dependent upon the
training of the staff within the program. Only in such federal programs
as Title I, where there is auxiliary technical assistance provided by
the federal government and the SEA evaluator has had some research training,
has technical assistance in evaluation been achieved. There is simply
insufficient staff to conduct special evaluation studies within the agency,
but fortunately, this SEA is located in a large metropolitan area which
can attract reputable and experienced contractors. The awarding of a
contract for state refinements to Title I (Section 183) has permitted the
Title I evaluator to engage in additional research on Title I evaluation.

The majority of staff associated with evaluation have their doctorates
in education. One individual is pursuing a doctorate in research and
evaluation at a local university. Based on a study conducted by the SEA itself,
there is a need for additional staff evaluation training in order to
upgrade technical assistance to LEAs in evaluation. Although staffing levels
are minimal and research training uncommon, particular individuals have
made contributions and are sensitive to certain evaluation issues. These
include the need for more rigorous evaluation requirements for Vocational
Education programs, the inclusion of statements regarding the uses of
required evaluation activities in Title I program improvement, and the
specification of alternative models for evaluation in special education. In
some of these cases, outside contractors were employed to provide the
necessary staff and expertise to successfully complete these projects.

State IV. While this state does have an evaluation unit, it is not staffed
anywhere near the level of State VI, an SEA which has approximately the
same number of districts and federal programs but seven times the staff
in evaluation. The professional staff in the evaluation unit total five,
and this is a smaller number than existed in previous years. Although the
SEA is growing in terms of educational enrollments, the new political
leadership in the state has embarked on a "reduce government bureaucracy"
campaign, and the evaluation unit faces additional staff cutbacks.

To be sure, in our site visits complaints were voiced by districts
regarding the evaluation capabilities of SEA staff. These expressions
of dissatisfaction, however, usually focused not on the quality of the
services being currently provided by the SEA but rather on the additional
services which the districts felt should be implemented. For example,
large competent LEAs want the SEA to conduct special evaluation studies
and issue waivers to districts so that they themselves can execute
desired research. Title VII staff in local school districts want to
receive technical assistance in evaluation on the same level as their
Title I counterparts. Programs which employ outside contractors for
evaluation want guidance as to the appropriate selection and monitoring
practices in these arrangements. It should be noted, however, that the
frequency and intensity of these criticisms often paled against those
which were leveled at the federal education agencies in terms of their
lack of communication and feedback to the districts which provide them
with evaluation data.



Capabilities of Local Education Agencies 

In many ways the issues related to evaluation capabilities of local
school districts mirror those discussed for state education agencies.
For example, the value placed upon evaluation by the central decision
makers, as evidenced by the location of evaluation within the
organizational hierarchy and the financial resources committed to the process,
helps to determine the evaluation activities which are undertaken.
These, in turn, select the types and levels of capabilities required to
perform these activities.

Local school districts and programs must march to the beat of two
drummers: both federal and state program and evaluation guidelines.
Many times these two are compatible; in some instances they are not.
Compliance with these requirements then forms the common denominator
across all LEAs in terms of evaluation activities.

Some districts, however, go well beyond these efforts by analyzing
additional program and outcome variables, evaluating programs more
frequently than is actually mandated, and exposing their evaluation products
to validation and review processes. While the SEA can be instrumental in
fostering such evaluation attitudes and practices, it is also the case
that individual districts can initiate or facilitate this activity and,
in certain respects, surpass the SEA in their evaluation efforts.
Districts with such track records constitute a small minority. Considering
the large number of districts with federal programs and evaluation
requirements (e.g., the NCES survey reported that almost 14,000 LEAs
receive Title I funds), the majority of effort often does not exceed what
is required.

Given this skewed distribution of activities across LEAs, it is
necessary to look at capabilities both in terms of activities associated
with adequately complying with federal/state evaluation requirements and
engaging in activities which go beyond these mandates.

When evaluation responsibilities are assigned to individual programs
and their staff. When evaluation is housed under a program's roof and
conducted by program staff, evaluation demands usually are limited to
compliance with federal/state mandates. Consequently, tasks are concentrated
on the collection and reporting of data in pre-specified ways. There
may often be no need to develop data collection instruments, as the SEA
itself designs and implements data collection procedures (e.g., Vocational
Education programs). While in some programs such as Title I, tests must
be selected, administered, scored, and analyzed, other programs simply
require headcounts and fiscal accounting. Building staff may participate
in such activities by administering achievement tests or employer surveys,
and the program staff subsequently aggregate the results or simply give
the raw data to the SEA. One classic example of how Title I evaluation
requirements are met in small LEAs is that of the school principal who
scores the tests, matches the scores, aggregates the results, and
completes the required SEA forms.



However, these efforts may not totally encompass the range of
tasks involved in evaluation. Some program staff do attempt to collect
more information than required. One determining factor is the amount
of money available for evaluation. Another key factor is the absence
of what has been labeled as the "compliance mentality." For example,
while the full-time Title I evaluator in Site H simply aggregated
test scores, reported them in the proper format, and occasionally
explained testing procedures to building staff, his counterpart in
Site J spent only 20% of his time on such tasks. The remainder was
used to interview teachers and principals about evaluation practices
and observe classrooms for recommendations regarding Title I program
implementation. The disparity between these two cases seems to center
around the attitude towards evaluation. While in the first site it was
stated that "the simpler and briefer evaluation was, the better," the
evaluator in Site J perceived federally-required tasks as "mundane
reporting" and consequently attempted to go beyond what was required by
the state. Title VII staff in Site C, while relying on outside contractors
to produce the required report, felt additional information to be
important for program improvement and generated (although ineptly) some
implementation data themselves. In many sites Vocational Education
program staff, who often view state required information as less than
useful, were involved in more than the simple collection of enrollments
and placement rates, spending much of their time visiting classrooms
to detect program strengths and weaknesses. It should also be noted
that strictly limiting oneself to compliance with federal/state
evaluation mandates may not always reflect disinterest in evaluation,
but rather a feeling that the existing resources (in terms of staff
and money) are insufficient to engage in efforts capable of yielding
accurate information.

Capabilities of program staff to conduct these federal/state
required evaluation tasks, let alone engage in discretionary efforts,
are minimal. Training backgrounds of program administrators and staff
typically reflect program specialties, primarily in the field of
education (e.g., vocational education and learning disabilities).
There is a striking absence of research training, let alone actual
courses in evaluation. Since most are certified teachers, some
exposure to testing and measurement was acquired during the
undergraduate years, but the time lapse between this training and the
present is likely to have been substantial and the content insufficient
for evaluation tasks. This should not suggest that these individuals are
professionally incompetent; many are active members of such organizations
as NASDE and subscribe to such educational publications as Education Daily.
The lack of evaluation training merely reflects the fact that these
individuals were not hired for their evaluation expertise and most likely
did not expect to have this included under their job responsibilities.

Technical assistance that is easily accessible to the LEA can enhance
capability. For example, Title I programs relying on staff for evaluation
have utilized the Technical Assistance Centers (TACs) to assist them in
satisfying federal evaluation requirements. As the NCES survey reported
by Goor indicated, in 1978-79 the TACs were called upon to provide
rudimentary information concerning student selection, test selection, use
and interpretation of normal curve equivalent scores, and proper
preparation of reports. More recently, assistance in better using
evaluation data and expanding evaluation activities has been offered to
interested districts. State education agencies can also be instrumental in
encouraging the use of the TACs, and some have supplemented these services
to LEAs who request additional assistance.

However, programs other than Title I may not have such readily accessible
evaluation expertise at their disposal. Technical assistance in some of these
programs still remains the responsibility of the SEA and concentrates more on
guidance in developing and implementing programs. One exception to this was

J provides its own evaluation assistance through workshops and
individualized services for administrators, program specialists, and others.
However, this has tended to focus on evaluation in general or on specific
state programs rather than federal programs. Other resources have been used
such as Title VII management workshops and state conferences. At the same
time it must be remembered that these workshops can only include so much in 

In addition, school calendars may not always coincide with workshop
schedules, and small LEAs who may most need the assistance may not have the
interest or the resources to attend.

When funds are available to support full-time evaluator positions, staff
characteristics tend to reflect more research-oriented training. For example,
in our site visits several of the full-time Title I evaluators within the
programs had doctorates in specialties involving at least some exposure to
research (e.g., science or math education). Consequently, these individuals
may be better prepared to fulfill evaluation reporting requirements and perhaps
even pursue some small-scale additional efforts, given that outside assistance
for specific design and analysis problems is readily available.

Again, what is problematic are those instances where high quality
technical assistance in evaluation is necessary but not readily accessible.
When there are evaluation tasks to complete which require some research
skill but there is (1) no district evaluation unit to contact, (2) no formal
assistance mechanism at the SEA or one which is understaffed, (3) no
encouragement by the SEA to utilize federal technical assistance resources,
and/or (4) no nearby interested university to call upon, program staff are
left with little opportunity to competently execute federal evaluation
requirements. For example, in Site

The SEA itself was understaffed and until recently had not encouraged its
districts to do more than comply. In fact, it was perceived that political
tensions were consuming much of the efforts of the SEA. Although the district
had a research unit responsible for evaluation, it was severely
understaffed, lacked specific evaluation expertise, and had not developed a
relationship with the programs. There was funding available for
evaluators, but the Board had prevented these vacancies
from being filled. The local universities did not include evaluation in their
degree programs. While the Title I program staff did use
the TAC, they did not optimally exploit this expertise, given that the
district administration and evaluation unit were fairly uncooperative in
helping them recognize where evaluation could fit into program decisions.
Title VII staff, on the other hand, had done some reading in evaluation and
knew the types of information that evaluation might provide, but could not
benefit from such federally provided technical assistance as their Title I
counterparts. The two thousand dollars which had been available for Title VII
evaluation could not attract a competent outside contractor. The result was
that federal evaluation requirements were marginally met,
and any additional efforts were few and suffered in quality.

Cases like this suggest that if the federal government requires evaluation and
information to be provided, it must support districts in accomplishing
these goals.

When evaluation responsibilities are assigned to a research and evaluation
unit. Some information is available on the capabilities of those individuals
employed by evaluation units. The UCLA study found that most Directors of
Research and Evaluation report having earned doctorates (76%), primarily in
administration. While 86% of these individuals claim to have completed at
least one course in evaluation as part of their formal academic training,
the quality of these courses can vary. A minority of 14%
acknowledge any specialization in research and statistics.

In terms of the characteristics of professional staff in general, a
topic not included in the UCLA research, BSSR found that most
staff in small local education agencies (i.e., 10,000-49,999 enrollment)
who were involved in research, development, dissemination or evaluation list
education as their major area of expertise. Fewer than 10% of the staff
list math and statistics as their primary specialty area. Approximately
one-half of these agencies report that 25% or less of their staff have
doctorates. In terms of large districts with enrollments of 50,000 or over,
a greater percentage of the staff in these agencies have their doctorates.
There is also a tendency to have higher percentages of staff within LEAs
whose primary expertise is in math and statistics.

Webster and Stufflebeam, in their examination of evaluation competencies
in local school districts with evaluation units, found that urban school
districts expect a high degree of technical proficiency from their staff.
For example, evaluation methodology and experimental design were
consistently important competencies. In addition, large urban
school district evaluation directors also wanted such skills as instrument
development, multivariate inferential statistics, and computer applications
in their staff as a whole. Based on job announcements obtained in our site
visits, we also found that urban districts who engage in a variety of
evaluation efforts asked more of their applicants in terms of formal academic
training in evaluation and technical expertise.

The less compliance-oriented a district is, the wider the range of
skills needed within the evaluation unit, although no one individual must
possess all capabilities. The competencies of the staff typically are
defined by the tasks which must be performed. In most cases, the level of
skills existing among the staff matches the level of expertise required by
the evaluation efforts, but occasionally there are individuals who are
totally unskilled in evaluation. For example, in Site the Superintendent
allowed the evaluation unit to become a "dumping ground" for "people whom
the Superintendent did not know how to handle or felt were too dumb to do
anything else." If the capabilities are too disparate, problems result:
either poor quality evaluations due to the lack of appropriate skills, or
dissatisfaction and eventually staff turnover due to the evaluators being
too highly trained for the tasks required. Inadequate hiring policies, lack
of professional incentives, and political obstruction of evaluation efforts
contribute to this state of affairs.



Matching capabilities to assigned tasks. Throughout this chapter it
has been maintained that the capabilities required for evaluation are highly
dependent on the evaluation tasks assigned. Simple evaluation reporting
to federal/state agencies does not require the same types of competencies
as does engaging in discretionary efforts which examine additional
evaluation questions. These latter endeavors also range across districts
with regard to their scope and complexity, from gathering information on
client satisfaction to estimating the impact of the program and its various
components. As the sophistication of these activities increases, so does
the level of expertise necessary for their successful execution. Table 2
presents information, based on a review of annual final reports, for the
Title I evaluation activities occurring in two local school districts. For
both sites, the activities required by the SEA are outlined, along with
efforts initiated by the districts themselves. The discretionary efforts
described for Site I are representative of those performed by most LEAs
which go beyond sheer compliance with federal/state reporting
requirements. The profile of evaluation activities occurring in Site II
reflects those conducted by the small minority of LEAs which engage in more
advanced evaluation practices.

As can be seen from Table 2, obtaining evaluation information for SEA
reporting does not require advanced technical and research skills. Simple
headcounts, calculation of means and ratios, and conversion of raw test
scores to standardized scores constitute the predominant mode of analysis.
Evaluators in Site II must also send in the raw data from which these figures
were derived. Ways to collect, compute, and report the data are explicitly
outlined in Title I evaluation and reporting manuals and further explained
by TAC and SEA staff. Some sense of record-keeping and data management,
along with an appreciation for accurate information, is warranted so as to
ensure the accuracy of the data. Rudimentary exposure to correct testing
procedures is also necessary, but much of this is explained in the
accompanying manuals. It should be noted, however, that data concerning the
prevalence of reporting errors suggest that there is much room for
enhancement of the basic skills required.
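
The calculations named above (means, gains, and conversion of test scores to normal curve equivalents) can be sketched in a few lines of Python. This is only an illustration, not the procedure in the Title I reporting manuals: the function names are hypothetical, the percentile-to-NCE conversion uses the standard definition of the NCE scale (mean 50, standard deviation 21.06), and the weighted gain assumes that each group's gain is weighted by the number of students with both pretest and posttest scores and divided by the total number so tested.

```python
from statistics import NormalDist, mean


def percentile_to_nce(percentile_rank):
    """Convert a national percentile rank (strictly between 0 and 100)
    to a normal curve equivalent (NCE)."""
    z = NormalDist().inv_cdf(percentile_rank / 100.0)
    return 50.0 + 21.06 * z


def nce_summary(pretest, posttest):
    """Pretest mean, posttest mean, and gain for matched NCE scores.

    Both lists must hold scores for the same students in the same order.
    """
    if len(pretest) != len(posttest):
        raise ValueError("each student needs both a pretest and a posttest")
    pre_mean = mean(pretest)
    post_mean = mean(posttest)
    return {
        "n": len(pretest),
        "pre_mean": pre_mean,
        "post_mean": post_mean,
        "gain": post_mean - pre_mean,  # pretest mean subtracted from posttest mean
    }


def weighted_nce_gain(groups):
    """Average the gains of several groups (e.g., projects within a grade).

    `groups` is a list of (gain, number_tested) pairs; the weighting
    scheme is an assumption made for this sketch.
    """
    total_tested = sum(n for _, n in groups)
    return sum(gain * n for gain, n in groups) / total_tested


if __name__ == "__main__":
    # Invented scores for illustration only.
    summary = nce_summary([40.0, 50.0], [46.0, 58.0])
    print(summary)
    print(weighted_nce_gain([(4.0, 10), (8.0, 30)]))
```

A sketch like this also shows where reporting errors can creep in: unmatched pretest and posttest records are rejected outright rather than silently averaged.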



Table 2

Title I Program Evaluation Activities in Two Local School Districts


Purpose: Program Description

  Activities Required by the SEA*

  -Obtaining counts of the number of students served and the number
   eligible but not served (I).
  -Obtaining counts of the number of participants by grade, supporting
   service, and race; parent participants in various program activities
   (I, II).
  -Obtaining counts of the numbers of staff employed in various
   capacities (I).
  -Obtaining counts of the number of students in different types of
   programs (e.g., pull-out) for Grades 2, 6, and 10 (II).
  -Obtaining data on the time devoted to instruction (minutes or hours
   per week) and instructor per student ratio for Grades 2, 6, and 10
   (I, II).
  -Indicating subject area and length of program (I, II).

  Activities Initiated by the District: Site I

  -Presenting information on the number of students served and the
   number eligible but not served by individual schools.
  -Developing a narrative of program characteristics.
  -Presenting data on the number of instructional staff by individual
   schools.
  -Designing and administering a survey assessing Parent Advisory
   Council member satisfaction.

  Activities Initiated by the District: Site II

  -Presenting information on the number of students served and the
   number eligible but not served for various instructional components.
  -Developing a narrative of program characteristics.
  -Presenting information on the number of students served by various
   demographic characteristics (e.g., sex).


Purpose: Examination of Inservice Efforts

  Activities Required by the SEA*

  -Obtaining counts of the number of inservice training sessions held
   (I, II).
  -Obtaining counts of the number of participants by instructional
   capacity (e.g., aide) and assignment to Title I or non-Title I
   programs (I, II).

  Activities Initiated by the District: Site I

  -Designing a simple inservice training evaluation instrument to
   discern whether participants "liked" or "disliked" the training
   sessions. Analysis focused on simple frequency counts for each item
   in the instrument.
  -Designing a needs assessment instrument for one inservice training
   session to determine what staff wanted from the training module.
   Analysis focused on simple frequency counts for each item included
   in the questionnaire.

  Activities Initiated by the District: Site II

  -Designing and conducting an evaluation of the Inservice Component.
   Methods included the maintenance of a detailed log of inservice
   offerings, structured participant interviews concerning aspects of
   the training they received, the administration of an adapted
   standardized instrument for measuring satisfaction with the
   training, and the collection of cost data. Analyses compared
   expected vs. observed use of the inservice center for the district
   as a whole and for staff in various instructional capacities.
   Average satisfaction ratings with the inservice training for the
   district were compared with city findings and national norms derived
   by the publishers of the standardized instrument. Satisfaction
   ratings were also compared among individual presenters. Cost
   breakdowns were provided per day, per participant, and per type of
   training and inservice activity.


Purpose: Reading and Math Achievement

  Activities Required by the SEA*

  -Obtaining counts of the number of students with both pretest and
   posttest data (I, II).
  -Indicating testing interval or dates (I, II).
  -Indicating the pretest and posttest norm dates (standard or
   non-standard), levels, subtest area, and form (same for pre- and
   posttests) used (II).
  -Presentation of individual student scores in percentiles (II).
  -Calculating the NCE pretest mean, NCE posttest mean, NCE gain (NCE
   pretest mean subtracted from posttest mean), and weighted NCE gain
   (NCE gain x number pre and posttested/number pre and posttested) by
   grade (I, II).
  -Calculating same statistics presented above for individual projects
   in Grades 2, 6, and 10 (I).
  -Obtaining counts of the number of students who improved in math,
   reading, work habits and behavior adjustment as assessed by a
   3-point standardized rating scale administered by teachers at the
   pre- and post-testing dates (I).
  -Presenting information on Title I programs which do not employ the
   norm-referenced model (e.g., description of objectives and
   assessment procedures and judgment of whether or not objectives
   were achieved) (II).

  Activities Initiated by the District: Site I

  -Designing and administering a survey to Title I and classroom
   teachers regarding their perceptions of the program. Analysis
   focused on frequency counts for survey items.
  -Designing and administering a survey to a random sample of 1,500
   Title I parents regarding their perceptions of the program's
   benefits. Frequencies of the "yes" and "no" responses were reported.
  -Analyzing the achievement test data by grade in terms of the mean
   pretest score, the mean posttest score, the mean NCE gain, the
   average amount of time spent in instruction, and the mean NCE gain
   per month.
  -Reporting the achievement or lack thereof for performance-based
   objectives (e.g., 65% of students in Grade 1 will gain at least 6
   and not more than 10 NCE's).

  Activities Initiated by the District: Site II

  -Comparing achievement test scores for Title I students with the test
   scores for students in a comparison group (this group consisted of
   students who were eligible for Title I but who were not served).
   These comparisons consisted of examining the mean pretest scores,
   posttest scores, and NCE gain. Pre- and post-percentile
   distributions were also plotted.
  -Comparing achievement test scores for those students given another
   year of Title I services although they no longer qualified because
   spring testing scores were above the cutoff criterion with test
   scores of similar students who were not given another year of
   service. Comparisons included same strategies described above.
  -Comparing achievement test scores for students receiving various
   instructional strategies in Title I programs. Comparisons were also
   performed for these groups with a "no treatment" group. Results were
   analyzed in terms of the mean pretest, posttest, and NCE gain scores
   and percentile gains. An analysis of covariance was also attempted
   but judged inappropriate due to extreme differences across the
   various instructional groups for pretest means.
  -Conducting an ex post facto analysis of achievement for Title I
   students and those eligible students who were not served.
   Grade-equivalent means were presented, t-tests were performed, and a
   multiple regression was used to compare served and nonserved
   students while controlling for IQ, prescore, and sex. Grade
   equivalent averages for students were also plotted against national
   norms and chance averages. Additional regressions were also
   performed which examined the effect of years of treatment and the
   treatment sequence on achievement.
  -Designing and conducting a methodological study to provide baseline
   data comparing reading pretest scores and gain scores for Title I
   students and a comparison group. Statistical procedures used to
   compare pretest scores to gain scores were Pearson product moment
   correlation coefficients and analysis of variance and Duncan's New
   Multiple Range Tests.
  -Comparing the extent of implementation for 5 Title I instructional
   approaches. Methods employed included classroom observations and
   structured interviews with teachers. Analysis involved comparisons
   of the costs per pupil, attendance levels, implementation ratings,
   use of support services, and achievement gains across the 5
   strategies of instruction.

* The Roman numerals presented after each activity in this column refer
  to the SEA requirements for Site I and Site II.

The discretionary activities performed by evaluation staff in Site I
require some research background and training. Knowledge of the rudimentary
principles of sampling, survey design, and questionnaire development is
important. A review of the instruments and evaluation practices conducted by
this site does not suggest the presence of highly advanced research
competencies. For example, rating scales were often limited to the
dichotomous categories of "like" and "dislike," and analyses never went
beyond the simple reporting of response counts for questionnaire categories.
Concerning achievement data, analyses focused on simple descriptive
statistics and did not attempt to incorporate possible comparison groups or
additional factors which might interact with the program. The execution of
these activities could possibly be improved by increased technical
expertise. However, it is debatable whether sophisticated research training
at the doctoral level is warranted, given the types of tasks conducted by
this evaluation unit. In addition to technical competency, some skill at
managing a number of ongoing research efforts is required.
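
To make concrete how little computation this kind of analysis involves, the response-count reporting described for Site I can be sketched as follows. The function name, the category labels passed in, and the sample responses are all hypothetical; the report does not reproduce the site's instruments or data.

```python
from collections import Counter


def response_counts(responses, categories=("like", "dislike")):
    """Tally questionnaire responses and report each category's
    count and percentage of all responses received."""
    counts = Counter(responses)
    total = len(responses)
    table = {}
    for category in categories:
        n = counts.get(category, 0)
        percent = 100.0 * n / total if total else 0.0
        table[category] = (n, percent)
    return table


if __name__ == "__main__":
    # Invented responses to a single questionnaire item.
    item_responses = ["like", "like", "dislike", "like"]
    for category, (n, percent) in response_counts(item_responses).items():
        print(f"{category}: {n} ({percent:.0f}%)")
```

Nothing here goes beyond counting, which is consistent with the observation that doctoral-level research training is hard to justify for these tasks alone.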

The activities conducted by Site II and its evaluation unit involve
relatively advanced technical expertise. Knowledge of both descriptive
and inferential statistics and their limitations is mandatory, and training
in such methodologies as correlation/regression strategies is required. A
background in a variety of research practices (interviewing techniques,
observational strategies, instrument development, and evaluation design) is
essential. Expertise in management of large data bases and coordination of
a number of evaluation efforts is essential, at least for a few of the
individuals in the unit. In addition, unlike Site I, this district assumes
the responsibility for formulating recommendations in its evaluation reports,
which may suggest that some background in educational theory and practice
is desirable.
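
As one illustration of the inferential statistics referred to above, the sketch below computes a two-sample t statistic comparing the gain scores of served and eligible-but-unserved students. It is a hedged example only: the report does not say which form of the t-test Site II used, so the pooled (equal-variance) version is assumed here, the function name is hypothetical, and the scores are invented.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(group_a, group_b):
    """Return (t, degrees_of_freedom) for a pooled-variance two-sample
    t-test on two independent groups of scores.

    Each group needs at least two scores so a sample variance exists.
    """
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pool the two sample variances, weighting by degrees of freedom.
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / df
    standard_error = sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    t = (mean(group_a) - mean(group_b)) / standard_error
    return t, df


if __name__ == "__main__":
    # Invented NCE gain scores for illustration only.
    served = [6.0, 9.0, 4.0, 8.0, 7.0]
    not_served = [3.0, 5.0, 2.0, 6.0, 4.0]
    t, df = pooled_t_statistic(served, not_served)
    print(f"t = {t:.2f} with {df} degrees of freedom")
```

Interpreting the result still requires judgment about group comparability, which is the kind of limitation the paragraph above says evaluators must understand; Site II's own abandonment of an analysis of covariance because of unequal pretest means is an example.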

Illustrative case studies. Selected case studies are also presented
so as to better depict the types of evaluator skills and characteristics
in LEA evaluation units and the extent to which these match the tasks
required. Factors which can erode existing capabilities are described. In
the first two case studies, district enrollments are similar, as are many
of the evaluation efforts conducted by the evaluation units. The
disparities lie in the unique organizational and administrative support
experienced by the district portrayed in the first case study. The third
case study, while based on a district with approximately twice the
enrollment, is an example of a "compliance-oriented" district which suffers
from a number of constraints. It is contrasted with another district of
similar size which engages in a variety of sophisticated evaluation
activities and which enjoys considerable local visibility and support.

Case 1. In Site B, the research and evaluation office has been in operation
for almost a decade and is highly visible within the organization. It is
administratively independent from the programs but does not experience
fiscal independence. It is supplied with greater than average resources in
terms of funding and allocated staff positions. The budget is 0.6% of the
total operating budget of the district, and there are 10 full-time
professional positions, including the Director, and two part-time positions.
Almost half of the unit's budget is derived from federal sources, with
one-quarter each being provided by the district and the state. All of the
evaluation positions, with the exception of the Director's, are at least
partially supported by federal funds. Evaluation is awarded an unusually
high level of sponsorship by the Superintendent and his Assistant. In fact,
the Assistant has advanced evaluation training and substantial experience
in the area, having formerly served as the Director of the Evaluation Unit.
The unit's activities include state and city wide testing, evaluation
of federal, state, and district programs, provision of evaluation training
to district personnel, conduct of special research projects, monitoring of
outside contractors when they are utilized, and general technical
assistance to program and building staff. In addition to complying with
federal evaluation requirements, they perform such discretionary studies as
longitudinal analyses of achievement and examination of the problems
associated with evaluating programs where children receive funding from
multiple sources. The unit is responsible for the required evaluation
activities for Titles I, VII, and IV-C and for Special and Vocational
Education programs funded through direct grants.

In the unit there are gradations of evaluation positions in terms of
responsibility and skills: 1 Director, 1 Evaluation Specialist, 5 Evaluation
Assistants, and 3 Evaluation Technicians. Although all staff participate
in all tasks (i.e., from specification of objectives to formulating
recommendations), the extent of their involvement varies, depending on
their capabilities. Staff typically work on evaluation activities for a
number of different programs.

The capabilities of the office as a whole match the tasks conducted.
Given that assistance to programs and reporting consume a major portion
of staff time, interpersonal skills are ranked highly, in addition to the
ability to write clearly. Specialized technical skills are perceived to
be required in only a few staff who share responsibility for special
research projects. No more than two staff have such proficiency, and TACs
typically provide sophisticated technical expertise for individual projects.
General technical skills focusing on testing applications, basic
descriptive statistics, and data processing are necessary for all
Specialists and Assistants.

Administration and management of research rests with the Director, who
has earned a doctorate in education. Originally hired as an Evaluation
Assistant, the Director has advanced in the organization, acquiring
substantial research and evaluation experience in the local school district.
Time allocations for this position are: 50% for management of evaluation
activities, 20% for conducting evaluations, 15% for such policy-related
activities as attending Board meetings to present evaluation findings, and
the remainder on program development and general administration.
Coordination of evaluation activities rests with the one Evaluation
Specialist, who devotes 70% of her time to this effort. This person has a
doctorate in educational research.

Evaluation Assistants primarily divide their time between conducting
evaluations and providing technical assistance. This position requires the
ability to design, implement, and complete evaluations, along with managing
the team of staff assembled for the project. Knowledge of norm- and
criterion-referenced tests, descriptive statistics, and computer
applications is required. These individuals all have their master's degrees,
primarily in educational research or the social sciences. Technicians are
used for the gathering, analysis, and interpretation of data and are
required to have some background in or willingness to learn data processing
and analysis strategies. These individuals are currently earning
baccalaureates in the social sciences.
In this unit there is a high degree of professional activity and staff
development. The majority of staff present papers at professional meetings
on testing and evaluation issues, and the Director is extremely active in
this area. The unit has applied for and received an NIE grant for research
related to objective-based testing in educational programs. In addition,
it has sponsored a joint M.A. program in educational evaluation with a
nearby university. This program was designed to train students for evaluation
positions in school districts and SEAs, and many of the staff participated.

There is no problem with recruitment since a teaching credential is 
not required, and nearby universities with evaluation programs often pro- 
vide interested applicants. In fact, the only problem mentioned was low 
salaries which tended to result in competent individuals being lured into
more lucrative positions in other sectors. 

Case 2. The evaluation unit in Site G was established in 1973. One half
of its resources are devoted to evaluation activities, with three-quarters
of this being targeted at evaluations of federal programs. Its
responsibilities include state- and city-wide testing and evaluation of
federal and state programs (Title I, Vocational Education, and Special
Education). While the unit does report directly to the Superintendent and
receives some district funding, it does not enjoy the level of support
from the top administration that Site B experiences. Evaluation is an
interest of the Superintendent only inasmuch as it is one requirement
attached to the receipt of federal funds. This lack of interest results in
the occasional undermining of the unit's authority and credibility and
restriction of the range of activities which can be undertaken.

In addition to complying with federal/state evaluation requirements,
other efforts are conducted for the purpose of improving programs. For
example, objective-based evaluation is performed, involving goal
specification, monitoring of the program, data collection, and assessment
to ascertain achievement of goals. Supplementary research on such issues
as matching teacher and learning styles has been conducted.

Including the Director, there are 6 full-time professionals, with 3
of these being directly involved in evaluating federal programs in the
district. These three individuals have earned their doctorates: two in
measurement and evaluation and one in educational administration. The
research-oriented degrees reflect extensive training in research and
statistical skills. The majority of the staff have also had experience in
research and evaluation. In fact, technical and evaluation skills were one
criterion for hiring the Director, who was recruited at a national
conference. The Director is responsible for managing the evaluation
efforts in the district and conducting evaluation studies. One of the
evaluation staff is hired exclusively for Title I evaluation, and the
other individual is assigned to evaluation activities associated with
Special and Vocational Education programs.

The problems in this unit do not stem from a lack of training and skills.
If anything, the technical sophistication of the staff is underutilized.
Activity in professional organizations is common, and some of the staff
have presented papers at conventions. It is the organizational constraints
which prevent the evaluators from fully exercising their skills. Unlike
Site B, where fiscal dependence of evaluation upon programs allows the
flexibility for obtaining additional funds for supplementary efforts which
improve the quality of evaluation, in Site G such fiscal dependence is
problematic. The primary reason for the difference between these two
sites involves the personalities of the individuals who hold the
pursestrings. While the Director of Federal Programs in Site B actively
promotes evaluation, his counterpart in Site G seeks to undermine the
process. What has resulted is the inability to conduct desired evaluation
activities for federal programs and even the loss of one highly trained
evaluator. For example, the evaluation unit has been discouraged from
hiring an external auditor to review its Title I evaluation report. The
Director would like to research aspects of the Title I evaluation models
in the district but cannot pry loose the funds to do so from the Director
of Federal Programs. In addition, staff development is inhibited, as
expenditures for attendance at professional conferences and TAC workshops
must first be approved by the pursestring holder. These problems are not
helped by the apathy of the Superintendent and have played a role in the
resignation of the Title I evaluator, an individual with a doctorate in
evaluation. The Director of Research and Evaluation has decided not to
fill this vacancy with such a highly trained individual, as it will only
result in the underutilization of evaluation skills.



Case 3. In this district, almost twice the size of those described in the
first two case studies, the evaluation unit was established in the early
1970s. It has a budget which is approximately 0.1% of the total district's
operating budget. It does not report directly to the Superintendent,
except on an "informal" basis, and is not administratively independent of
the programs. It receives minimal support from the policy-making bodies
in the district, which are constantly embroiled in political squabbles.

The primary efforts of this one-person unit are devoted to system-wide
testing and federal program evaluation. Although there are a variety of
federal programs operating in the district, the Director is only
responsible for testing and Title I evaluation reporting. Technical
assistance is erratically provided to program staff, and activities
essentially center around compliance.

The possible reasons for this are numerous. First, the unit is
drastically understaffed, given the district enrollment and the number and
size of federal programs with evaluation requirements operating in the
district. The probability of acquiring staff for the two Title I funded
evaluator vacancies is slim. In fact, it was stated that it is easier to
"obtain new computer equipment than additional evaluators." Much of this
is due to local politics, and the Superintendent and the Board have
malingered over approving the staffing of even one of these positions.

Second, the background of the Director is in the areas of business and
computer processing. Consequently, more of his time is devoted to
activities which coincide with these interests (testing, and computer
programming to score and analyze tests) rather than to evaluation.
Additional staff with evaluation skills would certainly be beneficial, as
they would relieve the Director of the evaluation responsibilities which
he does not enjoy. However, it would be difficult to find qualified
applicants even if positions were available. Salaries are low, teacher
credentialing is required, and former employment in the district is an
implicit qualification. Once again, politics only exacerbate the situation
by subtly persuading hiring committees to endorse "favorite sons" of the
Superintendent who do not necessarily have evaluation qualifications.



Case 4. In Site E the evaluation component is administratively distinct
and reports to the Deputy of Management. It has a hefty budget, almost
0.7% of the total district's operating budget. Its responsibilities
include the administration of all systemwide testing (located within a
subunit of this division) and the evaluation of state, district, and
federal programs. The evaluation unit has primary responsibility for all
evaluations connected with federal programs, with the exception of
Vocational Education. It enjoys considerable administrative support. In
fact, the Board recently created six additional full-time evaluation
positions specifically to conduct Board-requested evaluations.

The evaluation unit itself views its role as "technical assistance" in
the broadest sense. Evaluators help draft objectives, train teachers and
other staff in evaluation, explain test scores, and design evaluation
components. A number of special studies have been conducted which focus on
such areas as Title I evaluation procedures and evaluating specific
program components.

At present, there are 14 full-time and 6 part-time evaluation
professionals in the unit. In addition to the Director, there are 8
Divisional Assistants and 6 Administrative Assistants I and II. Divisional
Assistants are responsible for designing, implementing, and producing the
final evaluation product and require at least a master's degree with
related work experience and extensive course work in research and
evaluation. For Administrative Assistants, requirements are a master's
degree or related experience and evidence of a quantitative aptitude.
These individuals help in design and analysis and collect data.

There is a spectrum of expertise within the unit. More than half
of the Divisional Assistants have their doctorates, and some are presently
candidates for this degree. Two of the staff have their doctorates in
methodology (testing and statistics). There are also an anthropologist
and a child psychologist within the unit. Previous work experience has
ranged from evaluation research in a private firm to staffing duties in a
state legislature. Former teachers and principals primarily occupy
Administrative Assistant positions. The unit has utilized doctoral
candidates from nearby universities, and these are usually the individuals
who fill the part-time positions. The unit is exploring the possibility of
establishing a formal internship program in evaluation with a local
university.

Staff development activities are common. Once a week staff meetings are
held where individuals report on their current activities, present reports
of conferences attended, and distribute relevant papers and articles.




Implications of these case studies. These case studies should highlight
some salient factors. First, capabilities must be judged within the context
of the evaluation tasks assigned and the resources committed to the
process. Simple compliance with federal/state requirements, if technical
assistance is available, does not necessitate extensive evaluation
training. It is when districts attempt to go beyond federal requirements
that additional skills are required, and the level of expertise is related
to the level of sophistication defined by the evaluation. However, this
expertise should be viewed in terms of the unit as a whole rather than
each and every staff member.

Secondly, the existence of an evaluation unit helps to ensure that at
least some coordination of evaluation efforts occurs, that they are
conducted more efficiently, and that some level of technical assistance
can be made available to programs. It also tends to recruit, attract, and
select individuals with at least some research training. In capable
evaluation units the picture is one of "goodness of fit" between
capabilities of the unit as a whole and the evaluation tasks. There
usually is an adequate amount of technical competency, with one or two
individuals being trained in evaluation design, methods, and statistics.
Outside expertise such as consultants and TACs are used to help solve
specific methodological problems. Other staff usually have the necessary
abilities to assist in carrying out research activities. Many of the staff
may have served as former teachers, a not too surprising occurrence, given
that certification can be an explicit or implicit hiring requirement. This
teaching experience was often viewed as beneficial by the staff
interviewed, especially for providing technical assistance. It helped not
only in understanding the program but in being better received by program
staff and teachers.

However, what is important to recognize are the various ways in which this
match can be weakened. Adequate resources must be channeled to the
evaluation unit to hire staff and support some research costs. In
districts which have additional resources, the staff tends to be composed
of a greater proportion of individuals trained in fields related to
education research and evaluation. For example, in Site E (which had a
hefty 0.7%, [dollar amount illegible], of the total district budget),
there were 14 full-time professionals, over half with doctorates in
educational and social science research. In addition, compare this site to
Site C, which had the same number of students and types of federal
programs but only one evaluation position.

Coupled with this is the issue that many of these professional positions
are financed by federal funds. The comment was often made that, regardless
of administrative support, if the federal programs and their evaluation
requirements were eliminated, districts could not evaluate these programs
or could only do so at a reduced level. One reason for this is the
financial problems suffered in several LEAs at this time. Consequently,
the advent of evaluation requirements has provided monies not only to hire
evaluators who, if nothing more, have to respond to federal requirements
for information but also to hire evaluators who are highly trained and
engaging in many worthwhile and high-quality efforts.

Organizational problems can prevent districts from improving their
evaluation practices of federal programs. For example, in some cases the
inability to go outside the district to attract specific expertise and/or
to hire individuals without teaching credentials limits the capabilities
of the evaluation unit and may equip it with individuals who have few
evaluation skills and little interest in evaluation as a profession.
However, in the majority of districts visited, this was not the most
important problem. In fact, in some districts credentialing as a
requirement prevented evaluators' salaries from plummeting to those
offered by civil service. Other districts circumvent these possible
restrictions, as did Site B, by creating evaluator positions which do not
fall under the requirements typical for administrative positions.

Given the nature of required evaluation responsibilities, highly advanced
training, such as a doctorate in evaluation or methodology, is not
necessary for competent execution of these tasks. At the same time,
however, it should not serve as a deterrent to engaging in local district
evaluation and is required for many discretionary evaluation efforts. In
our site visits, complaints were voiced by research-trained doctorates
that their technical skills were often underutilized. Some were
considering their position as temporary "until something better came
along" which allowed greater latitude and creativity. Professional freedom
and incentives for evaluation activity need to be instituted so as to
prevent units from losing competent staff and to encourage trained
individuals to participate in activities which will improve the nature of
local evaluation practices.

At present, there exist few avenues for district evaluators who are capa- 
ble and eager to conduct evaluation research and improve the state of the art 
and educational programs. As Webster and Stufflebeam have noted, federal
funds have generally not proved beneficial in assisting local school districts 
in answering questions beyond the ones generated from required efforts. 

"The average evaluation department doesn't have the time to play the
funding game when, perhaps, one in ten proposals are funded. School
districts cannot be expected to even write a proposal if the chances
of funding are not at least seven in ten. Thus, the dilemma
continues. Basic research funds continue to be channeled to
universities and research and development agencies not brought to bear
on crucial problems in environments where the importance of the
dilemma is fully understood and experienced daily."

The ability of LEAs to obtain grants through the competitive grant process
as it presently exists was demonstrated by Site [illegible]. However, this
proposal was developed on off-duty time, and it cannot be expected that
all evaluation units can respond in this way. What this does indicate is
that there are LEA evaluation units which can design and conduct quality
research. These units should be given opportunities to do so.

Universities should also devote some thought to developing mutual
programs with state and local education agencies. These programs can
improve the capabilities in education agencies and also provide a training
ground for graduate students desiring to enter the profession. One comment
made by an evaluator with a doctorate in evaluation research concerned the
fact that graduate training did not prepare her to deal with the
bureaucracy of local schools. Although she did complete an internship
while in the program, devoting one day a week for one quarter is
insufficient. The establishment of these university and education agency
relationships might be fostered by professional organizations (see
Prentiss, 1980) or by federal agencies which award training grants or fund
workshops to develop evaluation skills. These sustaining relationships may
prove more beneficial than the short exposures typically provided by 2-3
day workshops.

The concept of an "endowed chair" or "postdoctoral position" might also
be employed in state and local education agencies. For example, one of the
staff on this project took a leave of absence from the university to become
the Director of Research and Evaluation for a large urban school district.
In this way, university professors not only obtain an opportunity to apply
their skills, but the agency also benefits from their expertise. Even
after an individual's departure, the procedures and ideas remain. These
types of positions may be able to provide skilled expertise when hiring
restrictions prevent districts from recruiting applicants for permanent
positions from outside the district or who lack teaching credentials.
Funds might be provided to assist in contributing to the salaries of these
individuals if district salary levels are insufficient.



4.6 THE CAPABILITIES OF OUTSIDE CONTRACTORS 

Given that the use of outside contractors for evaluation and research
activities has generated debate, this section specifically focuses on the
contractual arrangement. However, it must be made clear that time
constraints have limited our ability to comprehensively explore salient
issues. At this time, we can only present some general observations, based
on our site visits and interviews, and indicate possible areas which may
warrant further consideration.



The Capabilities of Outside Contractors Employed by Federal Agencies

Problems associated with the use of outside contractors by this sector
do not appear to primarily concern inadequate capabilities of contractors
and monitors. Based on an examination of contracts and interviews with
these individuals, we reiterate the conclusion reached by Berryman and
Glennan in their analysis of federal educational evaluations. In short,
"...few if any unsatisfactory evaluations of federal education programs
can be attributed to incompetence or bias of those who fund or conduct
them."

It is fairly obvious that the evaluation contracts awarded by OED and
[illegible] the program areas being evaluated, [illegible] of the
questions to be addressed, and the tasks specified. Often overlooked in
discussions of contractual practices are the range of skills and amount of
effort needed to mount and execute many of these studies. One major
contractor has outlined the required skills as including: (1) [illegible]
design; (2) sampling; (3) instrument design, including the [illegible]
clearance process; (4) relations with state and local education staff and
school personnel; (5) field operations; (6) data processing and computer
applications; (7) statistical analysis; (8) writing and editing; and (9)
management of research. These capabilities need not be possessed by every
staff member, but they must be reflected in the evaluation team as a whole
to ensure high-quality evaluation practices. [Illegible] execute specific
tasks such as secondary analyses, and the use of advisory panels with
expertise in both substantive and methodological areas is a standard
practice.

[Illegible] and interviews with selected contractors leads to the
following generalizations concerning the characteristics of contract
research teams and how skills are matched to the required tasks. Project
Directors and Principal Investigators, the chief managers and overseers of
the evaluation, are typically senior professionals with many years of
training and experience in research and management of large-scale studies.
They have previously served in a variety of capacities within such
institutions as universities, government agencies, and other research
organizations. They have published frequently and participated in
professional organizations. Middle and junior level professionals,
assigned to conduct and supervise such specific evaluation tasks as the
development of instruments or analysis strategies, have doctorates in
research-related fields and were recruited for their methodological
expertise. These individuals have also participated in a variety of
research efforts, and many belong to professional societies in their
respective fields. Data collectors and research assistants who interview
respondents, code and keypunch data, and participate in site visits are
usually graduate students, baccalaureates with some quantitative
background, or individuals (e.g., former teachers) with experience in local
and state education agencies. What this brief sketch should suggest is the
manner in which capabilities are matched to the tasks required in an
evaluation contract. Competence rests on the goodness of this fit.

It should also be noted how the size of the team fluctuates throughout
the course of a contract. During data collection, total staff size reaches
its peak. One example given was that of a two-year, $1 million study where
the team grew from [number illegible] during proposal preparation to 70
individuals during data collection and returned to its original number for
writing of the final report. Smaller contracts (e.g., $250,000) involve the
same numbers during the other phases of the research and approximately
20-30 personnel during data collection. This flexibility is usually
accomplished by hiring temporary personnel, and the remainder of
activities requiring advanced expertise are performed by full-time
permanent professional staff. How research such as this could be
accomplished under present civil service requirements and agency staffing
levels is unclear.

Problems associated with the contractual process have been noted in
these interviews. First, the procurement cycle with its frantic fourth
quarter has been blamed by some contractors for affecting both the quality
of RFPs and the proposals submitted. One contractor stated that during
this final quarter the time to respond was reduced to only 10 days, as
compared to the typical 4-6 weeks permitted during the rest of the year.
Related to this have been complaints regarding the selection and award
process. One concerns inequity in the awarding of contracts. However,
we have refrained from presenting tabulations which indicate the numbers
and sizes of contract awards across various firms. Such aggregate figures
can only be misleading unless background knowledge concerning the types
of evaluation activities requested, the resulting professional skills
required, the length of the contract, and the quality of all submitted
proposals for the contract has been incorporated. Obtaining this
information would necessitate a separate study in itself. Problems were
also mentioned, especially by contractors in smaller firms, concerning the
time from issuance of the RFP to the notification of the award. One
example was given where this process took 4-5 months. This can result in
situations where staff specified in the proposal have already become
involved in other projects.

Delays in contract payment were also cited as problematic, especially
by smaller firms. One common complaint was that the government and
contracts officers sometimes acted as if they were unaware of the contract
regulations themselves. Time constraints prevented us from adequately
investigating these issues. However, one concern on which we can comment
is the need for professional incentives and recognition of quality work
produced by contractors. It has been expressed that while the results of
contractors' efforts are cited in agency administrators' speeches and in
other public addresses, the performers of the research may or may not be
mentioned. The OE Annual Report is an admirable example of actually
identifying the individuals (both contractors and monitors) responsible
for the specific studies. Coupled with this issue is the observation made
by many contractors that the periodic uproar over contractor abuse can be
very demoralizing to the innocent firms. Contractors felt that the extent
of any abuse should be determined and the problems associated with the
contractual process which affect the quality of research should be
thoroughly examined.

The Capabilities of Outside Contractors Employed by State and Local Education
Agencies

The decision to hire an outside contractor usually stems from the need
to obtain an "objective" scrutiny and/or specific expertise for the
evaluation. However, the existence of an independent evaluation unit with
competent staff has increased an agency's own ability to collect data and
conduct special studies. Given administrative independence, the need for
outside objectivity becomes less clear. This is especially true in light
of the possible pressures imposed on outside contractors to produce
positive results which will ensure their reemployment for the following
year.

In fact, many program and evaluation personnel prefer having the
evaluation unit conduct the evaluation. It is felt that evaluation staff
better understand the program, are more available to provide advice, and
can more quickly remedy any problems which may surface during the course
of the evaluation.

Contractors are typically employed by SEAs to conduct special studies
of federal and state programs. For example, in State VI, due to an already
overworked staff, they have been used to perform evaluations of state
programs which were requested by the legislature. Title VII and Vocational
Education Programs in State II (an SEA without a distinct evaluation unit)
hired contractors to design state-wide evaluation systems which could
eventually be used to assess the overall impact of these programs. Less
frequent are those situations where contractors are employed to actually
aggregate and analyze data for required reporting to federal agencies. The
one instance we did find concerned an SEA which had only one district,
making the SEA and the LEA essentially one and the same.

In school districts without distinct evaluation units, contractors can
be hired to conduct required evaluation efforts, primarily for the
collection, analysis, and reporting of Title I evaluation data. For
example, the Title I program in Site A has employed outside contractors to
obtain the data for Model A reporting. Districts in State II, which
typically are too small to have their own evaluation units, used outside
contractors for Title I evaluations. In this case, the SEA "recommended"
this strategy, suggested that the contractors also be employed to collect
process-oriented information in addition to that required by the Title I
models, and encouraged LEAs to hire different contractors every three
years to ensure objectivity. Contractors are seldom employed to collect
required data for Special and Vocational Education programs due to the
type of information which must be reported. When there is an evaluation
unit but its primary role is "technical advising" rather than actual
evaluation performance, programs hire outside contractors for special
small-scale studies. For example, the Special Education Program in Site K
used a contractor to conduct a needs assessment of a specific handicapped
population.

The one exception to these general practices occurs with regard to
Title VII program evaluations. Regardless of accessibility to district 
evaluation unit staff, local Title VII programs typically hire outside con- 
tractors to evaluate their programs and prepare the report for submission 
to the federal agency. Although there exists no requirement to use outside 
contractors, district program administrators perceive this to be the case 
and express disgruntlement over the arbitrariness of this "requirement." 
Given the reasons previously stated, they see no reason why their evaluation 
units, especially when they are independent and competently staffed,
cannot conduct the evaluation.

While we cannot fully delineate the specific capabilities of outside
contractors nor derive estimates of the number of "incompetent"
contractors, some general observations can be provided. Based on our site
visits, review of contract proposals and reports, and interviews, some
SEAs and large urban school districts use the same reputable firms as
employed by the federal sector. Lists of contractors also include
university research firms (e.g., States IV and VI).

Stories of successful experiences in smaller LEAs are less frequent.
In some instances, contractors submitted inadequate two-page reports
for a $2,000 contract, refused to rewrite an unintelligible report since
payment had already been received, failed to ever submit a report, and used
inappropriate methodological and statistical procedures.

There are a number of factors which may lead to the lower incidence of
contractor ineptitude we found in SEAs and large LEAs. Contracts issued
by SEAs and large urban school districts can often rival the size of a
typical federal contract. For example, the average yearly award for 5
current external evaluation contracts in State VI was $163,000, and total
awards averaged $552,000. One large urban school district awarded a
$115,000 contract for its Title I evaluation. It is not surprising that
these contracts were bid on by and awarded to firms who are also major
contractors in the federal sector.

Even the instances we found when contracts were considerably smaller
(e.g., $5,000) did not involve unqualified individuals. One reason for this
was that SEAs tend to have better procurement and monitoring procedures.
For example, in State VI, the research and evaluation unit devoted 290
person-days to review and selection of contractors and monitoring their
progress. In State II, which did not have an evaluation unit, individuals
in the agency with research and evaluation skills assisted in the
preparation of the RFP and selection of the contractor. The bidding and
review processes were structured so that the criteria for selection did
not focus on cost issues. Payment was spread over the course of the
contract and regular progress reports were required.

Problems occur when contract sizes are extremely small and/or there are
inadequate selection and monitoring mechanisms. For example, in many
districts, we found the average size of outside contracts to be $2,500,
and it was reported that in one state Title VII contracts were usually
$500-$600. These levels determine the type of bidders which may respond.
Large firms simply do not make a practice of responding to such
small-scale contracts, thus allowing the market to be populated by smaller
and lower quality firms. This is not to suggest that small firms are
necessarily substandard, but there is a developmental pattern which may
exist. As small firms acquire a reputation for quality performance, they
are more likely to seek and be awarded larger-scale contracts which can
better support staff and attract competent individuals. In addition,
competent firms often recognize that the quality of work suffers when
there are not adequate monies devoted to the evaluation.

The following descriptions illustrate the types of contractors and firms
previously employed by the sites we visited.

Firm A. This firm had 6 full-time professionals and specialized in four areas:
Bilingual, Title I, Gifted Children, and Special Education evaluations. The
firm was created in 1972. On average, it conducts 25 evaluations per year
and the typical contract is $4,000. The Director has experience in state
and local evaluations and served as a consultant to some federal agencies.
The majority of professionals have degrees in education and have completed
some graduate work. However, the most important requirement in staff is
perceived to be their previous teaching experience in elementary and secondary
schools. Technical skills are not viewed as important since the Director feels
that good evaluation resembles "pedagogy" rather than research. He prefers
to personally train his staff in evaluation methods and reported that even
his secretaries assist in analyzing the data.

Firm B. In contrast, this firm is over 10 years old and only employs one full-time
professional. This is because the organization is composed entirely of
university professors who do additional consulting. All the individuals teach
at one university and have their doctorates. There is one statistician for
analysis tasks, a person trained in educational administration to manage the
organization and negotiate on contracts, and several curriculum specialists
who participate in on-site visits and monitoring activities. The firm never
bids on contracts which are under $5,000 or ones in which they are not interested.
One reason they can afford to do this is that outside contracting
is only a part-time job. Since the staff all have advanced training, however,
their fees are considerably higher than those requested by Firm A: $200-$250
per day as compared to $45-$85 daily.

The differences between these two firms are not often clearly perceived
by selection boards in some school districts. One reason for this is that
they often do not incorporate evaluation expertise into their composition.
Program directors are typically not trained in evaluation methods, which is
precisely the reason why they are seeking outside assistance to conduct the
evaluation. Consequently, not only selection but also monitoring of contractors
can be problematic. School Boards and Superintendents may only
complicate the process by inserting political and personal variables into
the selection process. For example, in Site A, where the Board makes the
final decision on whom to hire, the Title I Program Director can only make
recommendations and, due to district politics, feels that his suggestions are
often "the kiss of death" for a particular firm. In one site it was stated that
the Board tended to glance only at the price tag of the proposal rather than
the qualifications of the applicant.

There also exist few sources which local districts and programs can draw
upon to assist them in their selection process. Evaluation units can assist
in these matters. State II has developed suggested practices which districts
should follow in procuring and monitoring outside contractors. Although the
SEAs can often provide recommendations, they cannot compose lists of "competent"
individuals and delete any who may be questionable. This has resulted
in the situation where any individual can call an SEA and ask to be
placed on the list of available consultants. These factors strongly suggest
the need for the development of standards and guidelines which can be
used to inform local districts as to how procurement should be handled,
what criteria should be used in judging the quality of proposals, what the
district's rights are in the contractual arrangement, and how the monitoring
process should be conducted.



Footnotes 

Full references to documents cited in this chapter and others are given
in Section 8, References, by authors. The text identifies individual
authors where possible and the organization that produced the document
otherwise.

CHAPTER 5. HOW WELL ARE EVALUATIONS CARRIED OUT?



Robert F. Boruch, David S. Cordray 
and Georgine M. Pion 



Just because you can't lay an egg 
doesn't mean you cannot criticize 
an omelette. In fact, being able
to lay an egg may disqualify you.

Hernando Hlbachi 

This chapter is primarily concerned with the quality of evaluations
and factors that influence competent execution, analysis and reporting.
In Section 5.1, performance guidelines and standards are described. Section
5.2 reviews the quality of evaluations within the framework of guidelines.
Section 5.3 addresses some critical issues, including the need to assess
program implementation, the importance of the evaluation design, options
for promoting randomized field tests and the need for critique and secondary
analysis. The final section discusses constraints on quality of evaluations
and suggests options for relaxing them.

5.1 STANDARDS FOR ASSAYING QUALITY OF EVALUATION 

Over the last five years, a variety of efforts have been made to develop
guidelines on good practice in evaluation. Two of the efforts described here
focus on education: the USOE-NIE's Joint Dissemination Review Panel and the
Joint Committee on Standards for Educational Evaluation. Two focus on evaluation
more generally, covering education, health, law enforcement and other
areas. These organized efforts are relatively recent, partly because the
whole field is new. But individual university researchers have been working
on standards since the 1960s. There have also been earlier efforts, by the
Phi Delta Kappa National Study Committee on Evaluation, for instance, to
describe the field of evaluation, including standards of evidence.

The Various Sources of Guidelines and Standards 

The JDRP Ideabook, written by K. C. Tallmadge of RMC Corporation in
New Hampshire with substantial advice from NIE and USOE staff, was issued
in 1977 for the USOE-NIE Joint Dissemination Review Panel's use. The manual
is a guide for local program developers on the criteria used by JDRP in
judging the worth of their innovative programs and in judging the evidence
offered in support of a program's effectiveness. Innovations approved by
JDRP become eligible for dissemination support. According to Mary Berry,
then Assistant Secretary for Education, the guidelines were developed at
the request of practitioners.

The Joint Committee on Standards for Educational Evaluation is chaired
by Daniel Stufflebeam of Western Michigan University and consists of representatives
from twelve professional organizations with interests in standards
development. The organizations include the American Educational Research
Association, the American Psychological Association and others. Draft
standards were developed during 1976-1980 and reviewed by an external panel
of experts. The fifth and final draft of the standards will be edited by
the Joint Committee and published by a commercial publisher. Contributors
expect the standards to be used in college training. Their plans call for
continuous revision by a standing committee whose operation will be supported
through sale of the monograph. The Committee's work has been supported by
grants from the National Institute of Education, the National Science Foundation
and the Lilly Endowment.

The Evaluation Research Society's draft Standards for Program Evaluation
was issued in November 1979 by a committee consisting of ERS members.
Keith Marvin of the U.S. General Accounting Office is chairperson of the
committee, which includes university researchers, federal agency staff, and
GAO representatives as well as private contractors. The origins of the
standards lie in rather strong interests in the topic by some ERS members.
The draft has been constructed partly on the basis of other similar documents,
notably the JDRP Ideabook, the draft monograph of the Joint Committee on
Standards for Educational Evaluation and the U.S. General Accounting Office
draft guidelines.

The U.S. General Accounting Office released exposure draft guidelines
for assessing quality of impact evaluations in 1978. The justification
for issuance was the GAO's general mandate to oversee federal program
evaluations and the need for clarity in discussing quality. The guidelines
were developed by GAO with consultant assistance, and they are dedicated
primarily to impact evaluations rather than all types of evaluation. Nonetheless,
there is clear topical overlap with other guidelines discussed here.

The American Statistical Association has not issued general standards
or guidelines on statistical practice. However, a recent attempt to review
quality of a small number of surveys, by Barbara Bailar of the Census Bureau
and others, involved development of terse standards which require technical
expertise to apply. There is some overlap between GAO guidelines and these
criteria. But because the ASA criteria are not advertised as guidelines or
standards, we have not considered them here.

Differences and Similarities Among the Guidelines 

All the guidelines have certain features in common, topics which the
evaluator is encouraged to address. These topics include: description of
the program under evaluation, the rationale for choice of evaluation plan
and measures, an explicit plan or evaluation design, the inclusion of data
on reliability and validity of measurement, full and balanced reporting,
linkage between evidence and conclusions, and thoughtful interpretation of
results to major audiences for the evaluation. In brief, most guidelines
can be classified as bearing on Accuracy, Utility, Propriety, and Feasibility,
in the way the Joint Committee on Standards classifies its standards.
Some ethical standards are present in each set of guidelines. Attention to
individual privacy of respondents and to the public's interest in access to
reports and to statistical data for competing analyses, for example, is
explicit.

There are differences in detail among the guidelines. The Joint
Committee on Standards puts more emphasis on context of the evaluation and
practical procedures than others. Brief illustrations of proper adherence
to each standard are plentiful in the JDRP Ideabook and the Joint Committee
Standards. Fewer examples appear in the GAO document and virtually none
appear in the ERS document, but neither of these was financially supported
at the level of the other two.

Commentary on and Experience with Guidelines

Each set of guidelines has been issued by its sponsoring group with
a request for commentary and criticism. The USOE-NIE JDRP Ideabook was
issued earliest and adopted for routine use only in the Joint Dissemination
Review Panel operations. Three of the six federal agency staff who are
well acquainted with the Joint Dissemination Review Panel and who were
interviewed as part of this Project generally say that the guidelines have
improved the quality of presentations to the JDRP.

One of the few field applications of such guidelines was undertaken
in a recent field survey by Catherine Lyon and others at UCLA's Center for
the Study of Evaluation. They condensed guidelines developed by individual
academic researchers and by the Joint Committee on Standards for Educational
Evaluation to obtain a reduced list of thirteen standards. The reduced list
does not differ appreciably from the list, given earlier, of elements common
to standards discussed here. It was used to judge the quality of over 100
reports issued by evaluation units in large school districts. More remarkably,
the center staff investigated the reliability of judges' ratings based on the
standards. After brief training in the use of standards, interrater reliability
was found to be high, ranging from .80 to 1.00.
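
To give a rough sense of what a reliability check of this kind involves, the sketch below computes simple percent agreement between two judges who each mark the thirteen standards as met (1) or not met (0) for a single report. The ratings are invented for illustration, and percent agreement is an assumption on our part; the coefficient actually used in the CSE study is not specified here.

```python
def percent_agreement(judge_a, judge_b):
    """Share of standards on which two judges gave the same rating."""
    if len(judge_a) != len(judge_b):
        raise ValueError("Both judges must rate the same set of standards")
    matches = sum(a == b for a, b in zip(judge_a, judge_b))
    return matches / len(judge_a)


# Hypothetical ratings on thirteen standards (1 = met, 0 = not met).
judge_a = [1, 0, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 1]
judge_b = [1, 0, 1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 1]

agreement = percent_agreement(judge_a, judge_b)
print(f"Agreement: {agreement:.2f}")
```

Percent agreement does not correct for agreement expected by chance, so a chance-corrected index would normally be reported alongside it.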

The Joint Committee's standards have been field tested at least informally
in 29 sites. Generally field tests involve trying to apply the
standards to evaluations or evaluation reports. Four "National Hearings,"
meetings of professionals, have been held to discuss these standards and
to improve them over a two-year period. Commentary from the Hearings and
letters containing reactions to the standards are generally encouraging,
though there are exceptions. Both have been compiled in manuscript form
for distribution by Jeri Ridings at Western Michigan.

The ERS standards are undergoing review by interested members of the
Society and were reissued in May 1980. The effort is supported by membership
rather than foundation funds.

In our field research, we encountered no substantial familiarity with
standards or guidelines at the local level. School boards, program directors,
and others at the local level are likely to trust the evaluation rather
than to examine it relative to formal standards of evidence. Congressional
staff members may be less likely to trust evaluations, but even the GAO
standards are not especially salient for agency staff members or Congressional
staffers. The unfamiliarity may be attributable entirely to the fact that
development of guidelines is very recent. Nonetheless, it does seem sensible
to make tentative guidelines generally available and to make sure they are
understood. This is especially crucial at the federal level to reduce
unnecessary argument about what quality means in this area.

Debate Over the Utility of Guidelines 

Judging from the experience of the Joint Dissemination Review Panel,
the JDRP guidelines are useful in telling program developers what kind of
evidence on program effectiveness is credible and what kind is not. The
context is special in that adherence to the prescribed standards facilitates
favorable review and dissemination of the developer's product.

There is some debate about usefulness of guidelines or standards
outside this context. Daniel Stufflebeam, chairman of the Joint Committee
on Standards for Educational Evaluation, for example, catalogs the following
as criticisms or alleged shortcomings: The standards may promote a field
that is not needed and legitimate practices that may be harmful. They may
concentrate on minor matters, encourage bad practices which are not explicitly
proscribed, and impede innovation. The alleged benefits include making
language and definitions clear enough to facilitate communications and
establishing a common frame of reference and acceptable rules for dealing
with evaluation problems. Standards may also serve as a basis for monitoring
evaluations and enhance credibility of the process and product.

Similarly divided opinion surfaced in our interviews with federal
agency staff and Congressional support staff. A Congressional Budget Office
staff member pointed out that guidelines can be constraining: Some evidence,
regarded as poor under sensible technical standards, might be entirely adequate
for some policy purposes. Moreover, guidelines are a coarse simplification
of what we understand about quality of evidence and that simplification
may be regarded as sufficient by evaluators who could otherwise do much better
work. The major risk, according to a federal agency staff member, is that
standards can only be useful if there is some agreement on them by competent
evaluators.

Our general conclusion is that there is a fair amount of agreement
among groups working on guidelines about what should be considered in an
evaluation. The guidelines themselves are sufficiently promising to
warrant field testing and encouragement of their use.

They are also sufficiently promising to justify their being explained
to interpreters and users of evaluation results. At the local level, this
includes program directors, superintendents, school boards, and the like,
if there is sufficient interest. At the federal level, this includes
program executives, Congressional support agencies and staff.

It is not clear that guidelines are appropriate for incorporation into
law or regulation: Their function is advisory. It is sensible to assure
that legislative and regulatory language is consistent with guidelines.

5.2 QUALITY OF EVALUATIONS 

The quality of evaluations can be assessed at different levels of effort.
These range from overall judgments made by experts, through systematic
assessment of the contents of specific reports, to reanalysis of the raw data
generated by an evaluation. The expertise and time necessary to perform
intensive reanalyses are sufficient to warrant a step-wise approach to quality
assessment; that is, successively more stringent levels of review should
be considered. If a study "passes" the simple review procedure, it becomes
a candidate for more rigorous assessment. The rationale and justification
for this approach are simple: serious problems can generally be assayed
through an expert review of the report. Some problems are so damaging that
more detailed reanalysis is simply not warranted.
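
The step-wise logic described above can be sketched as a simple screening routine. This is an illustration only: the `Report` fields, the stage names, and the pass/fail flags are hypothetical stand-ins for judgments that reviewers would actually make, not a procedure taken from any of the guidelines.

```python
from dataclasses import dataclass


@dataclass
class Report:
    """Hypothetical record of review outcomes for one evaluation report."""
    title: str
    passes_expert_review: bool       # overall judgment by an expert
    passes_content_assessment: bool  # systematic check of report contents


# Stages ordered from least to most costly; each is tried only if the
# previous one was passed.
STAGES = [
    ("expert review", lambda r: r.passes_expert_review),
    ("content assessment", lambda r: r.passes_content_assessment),
]


def highest_level_warranted(report):
    """Return where the step-wise review stops for this report."""
    for name, check in STAGES:
        if not check(report):
            return f"stop after {name}"
    return "candidate for reanalysis of raw data"


print(highest_level_warranted(Report("Study X", True, False)))
```

The design choice is that costly reanalysis is never reached by a report with a flaw that a cheaper review would have caught.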



Previous Examinations of the Quality of Evaluations 

Because evaluation is a rather young enterprise, and because standards
have only recently been developed and field-tested, it should not be surprising
that comprehensive assessments of quality are rare. Consequently, we have to
rely on a variety of sources.

The General Accounting Office. In April 1975 the GAO surveyed program
representatives at local and state education agencies regarding the quality
of evaluation reports produced by state and local agencies for Titles I, III,
and VII. A national statistical sample of 832 local school districts and
all SEAs were requested to complete a questionnaire survey. Respondents were
familiar with each program. GAO's questions about
credibility of the findings referred to the respondent's confidence in the
soundness of the methods and reasonableness of conclusions. The questionnaire
defined "qualification of findings" as the extent to which the results were
properly qualified, assumptions were made explicit, and the evaluator described the
conditions under which the findings were not applicable.

Table 1 presents the percentage of program officials who rate the two 
aspects of quality as "adequate or better." In the survey, the local and 
state program officials were asked to rate the quality of local and state 
reports, generating cross-level judgments of quality. That is, GAO obtained 
local views of state reports and state views of local level reports and 
ratings of quality pertaining to evaluations conducted at the same level of 
government as the raters.

The GAO findings are interesting on two counts. The judgments pertaining
to the quality of the evaluations are consistently higher for same-level reports
than for cross-level reports, indicating that judgments on quality may be
confounded by the utility of the information at each level of government. Of
more relevance to the issue of quality, however, is the fact that even these
very global assessments are not complimentary with respect to the quality
of evaluations. The highest rating of "adequate or better" was ascribed by
only 69% of local program officials to local reports on Title III. The lowest
level of quality was ascribed by state officials to local Title I reports:
only 31% of the reports were rated as "adequate or better" for the manner in
which evaluators qualified their results. The remaining judgments are evenly
distributed between these two extremes.
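
The same-level versus cross-level pattern can be checked directly against the Title I and Title III percentages in Table 1. The sketch below is only a tally of those figures; the assignment of each percentage to a view and a report source reflects our reading of the table layout, and the Title VII figures are left out because only two values per panel are reported.

```python
# Percent rating reports "adequate or better", keyed by (rater view, report source).
# Each list holds: Title I credibility, Title III credibility,
#                  Title I qualification, Title III qualification.
ratings = {
    ("local", "local"): [62, 69, 59, 66],
    ("local", "state"): [55, 64, 50, 61],
    ("state", "local"): [33, 40, 31, 40],
    ("state", "state"): [41, 60, 53, 53],
}


def mean(values):
    return sum(values) / len(values)


same_level = [x for (view, source), vals in ratings.items() if view == source for x in vals]
cross_level = [x for (view, source), vals in ratings.items() if view != source for x in vals]

same_level_mean = mean(same_level)
cross_level_mean = mean(cross_level)

# Within each view, compare the same-level rating with the cross-level rating
# for the same program and the same aspect of quality.
same_always_higher = all(
    s > c
    for view in ("local", "state")
    for s, c in zip(
        ratings[(view, view)],
        ratings[(view, "state" if view == "local" else "local")],
    )
)

print(same_level_mean, cross_level_mean, same_always_higher)
```

On these figures the same-level ratings exceed the cross-level ratings in every one of the eight paired comparisons.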

Lyon and others, Center for the Study of Evaluation. Lyon and others
requested evaluation reports from each respondent in their survey. They
received 116 reports which were then reviewed according to the presence or
absence of criteria considered to be necessary elements of an evaluation.

Table 1

Percentage of Program Officials Rating Quality as "Adequate or Better"
for State and Local Evaluation Reports: GAO (1977)

Credibility

                                          Program
              Source of Report      Title I    Title III
Local view    Local reports           62%         69%
              State reports           55%         64%
State view    Local reports           33%         40%
              State reports           41%         60%
Title VII (two values reported): 53%, 39%

Qualification of Findings

                                          Program
              Source of Report      Title I    Title III
Local view    Local reports           59%         66%
              State reports           50%         61%
State view    Local reports           31%         40%
              State reports           53%         53%
Title VII (two values reported): 61%, 41%

Their summary of the percentage of items appearing in the 116 reports
is shown in the accompanying table. Evaluation reports often do not uniformly describe
program implementation, address issues of reliability and validity of the
data sources, or link evidence with conclusions. What appeared to
be best about reports is that the great majority identified
data collection sources, data analysis procedures,
and results. These last three items are uniformly required in the state
reporting forms filed by LEAs and labelled "evaluation reports." If these
reports were part of the sample reviewed by Lyon and others, we concur that
they are quite sparse in that a narrative description of the program, recommendations,
and conclusions are rarely present.

Other Estimates of Quality. The approval rate for the projects submitted
to the Joint Dissemination Review Panel (JDRP) can be viewed as an upper-bound
estimate, biased in the direction of higher quality due to voluntary
submission, of the quality of outcome evaluations at the local level.
Specifically, of 421 submissions to JDRP, 245 (57.7%) have been approved.
Discussion with staff members affiliated with JDRP suggests that the quality of
evidence brought before the panel has improved in recent years; nevertheless,
the approval rate is substantially below 100%.




[...] "promising" programs. Although [...] exclusively based on methodological
considerations, [...] screening criteria. Ultimately,
twelve were selected as case studies, "mainly because of the presence of
more complete data and documentation" (p. 2, Vol. III, 1979).

Studies by Campeau and others at AIR and Wargo, Campeau and
Tallmadge, also through AIR, report considerably lower percentages of
methodologically sound studies. Campeau and others examined
175 programs in Bilingual Education, finding only 8 programs "judged
to merit site visiting." For Wargo and others, the ratio of successfully
evaluated projects compared to the number that were reviewed was considerably
lower.
A recent attempt to isolate exemplary Career Education Activities was
[...] through an OED contract to AIR. Through
nominations made by a variety of personnel at federal, state, and local agencies,
the projects were identified. Reports were solicited from
the directors of nominated activities. A three-phase review procedure
[...] passed most of AIR's criteria (but not all);
ultimately 10 were submitted to JDRP for approval (11 actually passed
the criteria, but one had already been submitted to JDRP). Seven of the 10
projects were approved by the panel.

CHARACTERISTICS OF DISTRICT EVALUATION REPORTS
Percent of reports (n=116) with each characteristic*
Source: Lyon et al. 1978 (CSE,

 1. The program or product or other object under study in
    the evaluation is described so that its objectives
    are clear.                                                   38%   62%

 2. The program or product or other object under study in
    the evaluation is described so that the form of its
    actual implementation is clear.                              17%   83%

 3. The purposes of the evaluation are described; purposes
    may be stated in terms of the evaluation questions
    or objectives.                                               53%   47%

 4. Audience(s) for the evaluation information are
    identified.                                                  35%   66%

 5. Participants in the educational program and the evaluation
    study, and how they were selected for participation,
    are described.                                               70%   30%

 6. Data collection sources, such as tests, records, or
    observation forms, are identified.                           92%    8%

 7. The data collection sources are comprehensive enough
    to answer the evaluation questions.                          46%   54%

 8. The reliability of the data collection sources, and the
    validity of the data collection sources for the purposes
    intended, is described.                                      10%   90%

 9. Data analysis procedures are described or are evident
    (as in detailed tables).                                     81%   19%

10. Evaluation results are described or presented.               97%    3%

11. Conclusions or recommendations are drawn from the
    results.                                                     52%   48%

12. The congruence of the conclusions with the information
    provided is described or evident.                            28%   72%

13. The written presentation of whatever was done in the
    evaluation is clear (even if standards above were not
    met).                                                        65%   35%

[Percentages may not add up to 100% because of rounding errors.]

We conclude from these assessments that the procedures employed in the
majority of evaluations are insufficient for judging the effects of the
projects or programs on children. Among the items consistently reported as
reasons for rejecting a project from consideration as an exemplary activity
are [...].

A Case Study of Three Contracted Evaluations 

In the course of our field visits, studies containing a variety of
flaws were uncovered. While the extensiveness of the flaws detected in some
aspects of this case study is not representative of the reports generated
by the research units we visited, it is nevertheless informative as to the
types of errors that appear.

The case is based on evaluations of a bilingual education program in
an urban school district. Responsibility for evaluation was given to a
contractor, selected through an annual competitive bidding process. Each
year a different contractor undertook the evaluation. Over the three years
there was gradual improvement, attributable perhaps to better contractors,
better selection procedures, or more sophistication in the LEA.

The evaluation for 1976-77, done by a Manhattan-based firm, is clearly
the worst of the three. The bilingual program or its objectives are not
especially well described and the objectives of the evaluation are rhetorical.
Descriptions of procedures, sources of information, and the like are weak.
The attention to negative aspects of the program is negligible. The technical
aspects of the evaluation are inept at best. The following phrases appear
in the report. Despite the technical jargon, they are misleading at worst
and meaningless at best.

. "Any increase, whatsoever, in percentile ranks is a significant
  increase because the percentiles are adjusted per age."

. "School B showed an average increase of 46 percentile ranks
  while the School A children showed an average increase of only
  15 percentile ranks."

. "Anything less than 95% certainty is not considered to be
  significant growth."

. "Seven youngsters were pre and post tested... the p value of .001
  is the highest one can possibly measure with inferential statistics
  of this nature."

. "Inferential statistics analyzes trends..."
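
One reason statements about average gains in percentile ranks mislead is that percentile ranks are not equal-interval units. The sketch below, which assumes normally distributed scores, converts percentile ranks to Normal Curve Equivalents (NCE = 50 + 21.06 z) and compares a ten-point percentile gain near the middle of the distribution with the same gain near the top. The particular percentile values are chosen only for illustration.

```python
from statistics import NormalDist


def percentile_to_nce(percentile_rank):
    """Convert a percentile rank (strictly between 0 and 100) to an NCE."""
    z = NormalDist().inv_cdf(percentile_rank / 100)
    return 50 + 21.06 * z


# The same ten-point gain in percentile rank at two places on the scale.
gain_mid = percentile_to_nce(60) - percentile_to_nce(50)
gain_tail = percentile_to_nce(99) - percentile_to_nce(89)

print(f"50th to 60th percentile: {gain_mid:.1f} NCEs")
print(f"89th to 99th percentile: {gain_tail:.1f} NCEs")
```

Because equal percentile gains correspond to quite different amounts of underlying change, an "average increase of 46 percentile ranks" cannot be interpreted without knowing where on the scale the children started.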

The 1977-78 evaluation is a bit of an improvement. The program and
its objectives are depicted in tabular form; the objectives of the
evaluation are described in a general way. Achievement tests are described.
But the participants in the study aren't identified, reliability of data
sources is not mentioned, and specific data analysis procedures are not
described. The prose is dreadful. We are told that "evaluation is a formal
program...which can then be fed to personnel"; there are lots of references
to "meaningful" results, and some thanks are offered to the "Project
Coordinatoress." The technical common sense is negligible, e.g., "all of the scores
...are statistically significant..."; "t-scores were used when the numbers
were not large enough to use z scores. A t-score is widely used standard
score in which the mean is 50 and the standard deviation is 10."
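
The contractor's description of t-scores runs together two different things: the T-score, a standard score rescaled to a mean of 50 and a standard deviation of 10, and Student's t statistic, which is what one uses in place of z when samples are small. The sketch below computes both for a small set of invented pre-test and post-test scores; the data and function names are hypothetical and serve only to show that the two quantities are unrelated.

```python
from math import sqrt
from statistics import mean, stdev


def t_scores(raw_scores):
    """Rescale raw scores to standard scores with mean 50 and SD 10."""
    m, s = mean(raw_scores), stdev(raw_scores)
    return [50 + 10 * (x - m) / s for x in raw_scores]


def paired_t_statistic(pre, post):
    """Student's t statistic for the mean pre-to-post difference."""
    diffs = [after - before for before, after in zip(pre, post)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


# Invented scores for seven pupils.
pre = [12, 15, 11, 18, 14, 16, 13]
post = [14, 18, 12, 21, 15, 19, 15]

standardized = t_scores(pre)                # describes each pupil's standing
t_statistic = paired_t_statistic(pre, post)  # tests the average gain

print([round(x, 1) for x in standardized])
print(round(t_statistic, 2))
```

The first result describes where each pupil stands relative to the group; only the second bears on whether the average gain is statistically reliable.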

The contractor was candid in recognizing that some student achievement
objectives were not achieved. However, no attempt was made to explain any
of these. The increases in achievement were attributed to the program without
any recognition of competing explanations.

The 1978-79 evaluation was done by another Manhattan-based firm for
about $5,000. It involved the following activities:

. Field observation of program classrooms in three periods to
  determine whether teachers adhered to reasonably sound
  pedagogical practice in organizing and teaching students.

. Interviews and questionnaire surveys of program staff, school
  and district administrators were used to determine their satisfaction
  with the program, character and quality of management.

. Interviews with parents to determine the level of their
  participation, existence of required advisory groups, and
  parental opinions.

. Group evaluation sessions with program staff
  to obtain some sense of accomplishments and problems.

. Review of program documents to determine if pertinent materials
  were available and how new pertinent material might be obtained
  from other districts.

. Analysis of student achievement and assessment of changes in
  self-concept and knowledge of culture.

This report meets reasonable standards for evaluation in that program
objectives are reasonably clear, the program itself is described, the purposes
of the evaluation are described, respondent groups are identified and procedures
for collecting information specified. There was some clear attention
to validity of the information, results are described, conclusions were
drawn and recommendations were made. The written presentation is clear.
The contractor's judgments about whether students benefited academically
from the program are based on measurement of achievement before entry and
after nine months of the program. The procedure used is reasonably clear.
[...] generated increases [...] be
correct in stating that the program [...] improving achievement,
but the data are not sufficient [...].
It was candid in identifying [...] and
about conflicting testimony. The latter [...] parents who
agreed that their children's gains [...] and those
who did believe the program was instrumental. The report was candid in
reporting vague lines of communication [...].

^^^^^^SMkJlandar^^ 

Of cr^Sf "-^i-^l evaluation is the establishment 

Gene Glass describes the ffuibuity If °t1f '° P^^ram successful, 

relate to mastery level or mlnimn^L ! " Procedures, especially as they 
the pervasiveness of these prac^lLf "^^f evaluation. Because of^ 
designs, such criteria dLervfat iL^^^"*- "'"^^^^ ""Pl^^ ""h other 
is that setting values such as "SOf of ^h^"";.?^™-"'* ^^^^^ argument 

grade level" is simply an insufficient ^^^^^f^ren will read at the third 
of a program. As a'basls ^^^dj^' these :L'f*f« effectiveness 
prugram can be declared successful o^'n» standards are too arbitrary. A 
criterion at low or high end of tL "^^"°«ssful simply by setting the 
encountered may help tfclaJify t^e Ti^^lf- ^ ^e • 

of assessing effectlvness. ^ difficulties associated with this method 

Example 1 "The Title I participating students will exhibit
a mean gain of 1 NCE in Math-enumeration on the C.A.T."



Example 2 



Example 4 



self image and interpersonal relationships. 

-on^^^'n^i^:;^'^^-^^- " particularly troublesome m the 

knowledge is to" what'L an acce'Sle If'!""? ''^^^ ^^'l- P'^i- 

programs, for example Title 1 the vear ° Perfopnance. Within ongoing 
such standards are insufficient for judging program success. Testing level
of competency before and after the program (Examples 1 and 2) is an
improvement over the after-only strategy (Examples 3 and 4), but it is still
insufficient for attributing the gain to the program. Other competing
explanations such as normal growth are as plausible in accounting for
the gains as the program. An example of when the criterion-based
assessment is a more valid basis for judging effectiveness is when a comparison
group has also been assessed. That is, a standard (e.g., 80% success) set
by an educator, advisory council, and/or through experience can be
meaningfully interpreted only when program participants' performance is assessed
against the performance attained by those who did not receive the program.
However, even with the use of comparison groups, unless constituted through
a randomization procedure that is maintained throughout the duration of the
study, evidence of success may still be questionable due to other rival
interpretations not controlled by the evaluation design.
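
The point about criterion standards can be illustrated with a small sketch. The scores, the cutoff of 70, and the function names below are hypothetical and are not drawn from any evaluation reviewed here; the sketch only shows that a program group can meet an "80% success" standard while a comparison group does just as well.

```python
def proportion_meeting(scores, cutoff):
    """Share of students whose score is at or above the cutoff."""
    return sum(1 for s in scores if s >= cutoff) / len(scores)


def criterion_contrast(program_scores, comparison_scores, cutoff):
    """Compare success rates on the same criterion for both groups."""
    program_rate = proportion_meeting(program_scores, cutoff)
    comparison_rate = proportion_meeting(comparison_scores, cutoff)
    return {
        "program": program_rate,
        "comparison": comparison_rate,
        "difference": program_rate - comparison_rate,
    }


# Hypothetical post-test scores, invented for illustration only.
program = [72, 81, 65, 90, 77, 84, 69, 88, 79, 93]
comparison = [70, 79, 66, 85, 74, 82, 61, 86, 75, 91]

result = criterion_contrast(program, comparison, cutoff=70)
print(result)
```

In this invented case both groups reach the 80% standard, so the standard alone says nothing about what the program added. Even a nonzero difference would remain open to rival interpretations unless the groups had been formed by random assignment.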

Impact of Regulations, Requirements and Legislative Mandates on Evaluation

The description of the regulations governing evaluation presented in Chapter
3 identifies issues that have implications beyond the specific programs
that were considered.

Once said, it is obvious that the types of evaluation practices that
are prescribed will influence the quality of evaluations carried out. It
is important to recognize that some of the guidelines contain statements
that are inconsistent with good research practices. For example, the
Bilingual Basic Grants to LEAs regulations specify the use of comparison groups
to estimate what performance would have been in the absence of the program. In
the next line, "statistical and historical comparisons" are identified as
examples of presumably adequate means of deriving such an estimate. These
procedures are notoriously subject to statistical biases and other pervasive
threats to the validity of the conclusions. If details of this sort are to
appear in regulations, deliberate attempts should be made to have them
reviewed by methodologists so as to avoid encouraging the use of weak assessment
strategies.

For three of the four program regulations we reviewed, data in the
form of test scores, head counts, type of service rendered, and so on are
to be gathered for the purpose of aggregation. Specific reporting
requirements can yield data which can be aggregated to the national level. For
the recipients of comparable data (i.e., states and federal agencies),
regulations serve a useful purpose. If, on the other hand, the purpose of
requiring LEAs and SEAs to collect information (and report it) is to
stimulate program improvement through the use of evaluation data, minimal
reporting requirements (as in Title I) will yield little useful data for
that purpose. That is, test scores, by themselves, do not provide useful
information as to why gains or losses were observed. On the other hand,
for those local agencies who have little interest in evaluation, regulations
will serve as a minimum standard for compliance. If the regulations are
too demanding, given the available resources, reporting is likely to yield
inadequate data or require extensive technical assistance.
Judging from recent published commentary (Barnes and Ginsburg, 1979;
Cross, 1979; Linn, 1979; and Wiley, 1979) regarding legislative requests
for the development and implementation of evaluation models as part of the
reporting requirements, there appears to be a need for a formal review
mechanism where legislative representatives, those persons devising the
models, users, and critics can assess the informational relevance of the
product. Further, pilot tests, using a representative sample of sites (not
volunteers), should be routinely conducted to assess the feasibility and
desirability of employing evaluation models.

The Evaluation Plan and the Proposal Review Process 

Direct grants usually require an evaluation component. In most cases
these grants are awarded for the purpose of demonstrating the feasibility of
an innovative educational program or for providing special services.
Ultimately, those programs that are superior to traditional practices should
be adopted by other educational agencies. In order to ensure that the
educational value of an innovative program is understood, a well designed
evaluation plan should be articulated. This would include, ideally, an
evaluability assessment of the program, and the collection of process and
outcome information. Some of these are overlooked in practice. Further, the
need for careful program planning and evaluation planning prior to the
implementation process appears to be undervalued, judging from characteristics
of the proposal application process for direct grants.

For example, regulations for direct grants to LEAs under Title VII
(Bilingual Education) provide a summary of the point values assigned to
each review criterion. As was indicated earlier, for basic grants, the
evaluation plan was allocated 15 of 110 possible points. Here, the
evaluation plan contributes a rather insubstantial amount to the final point total
within the review process. Judging from this case, the selection of projects
for funding seems to be more heavily weighted towards the substantive,
managerial, and staffing aspects of the proposal.

To assess the pervasiveness of these practices, we examined the review
criteria for additional direct grant programs. These included four programs
under the discretionary grants provision in Vocational Education and eight
direct grant programs funded under ESEA, Title VII. A summary of the point
values ascribed to the evaluation plan and general methodological/evaluation
considerations is presented in Table 3. From the entries in Table 3, it
is seen that the grant application review process entails assigning between
100 and 110 total points to each proposal, the number of criteria used ranges
between 5 and 11, and the points allocated to the sufficiency of the evaluation
plan (all but one program explicitly mentions the evaluation plan as a
criterion) range between 5 and 15. That is, at most, 15% of the review
process is devoted to the adequacy of the evaluation plan.
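
The "at most 15%" figure can be checked with a short calculation. The sketch below is illustrative only: the point values are transcribed from Table 3 (total points and points for the evaluation plan), while the dictionary layout, the "Vocational"/"Bilingual" label prefixes, and the function name are assumptions made for the example.

```python
# (total points, points for the evaluation plan), as transcribed from Table 3.
TABLE_3 = {
    "Vocational: Program Improvement": (100, 8),
    "Vocational: Indian Tribes": (100, 10),
    "Vocational: Bilingual Vocational Education Training": (100, 12),
    "Vocational: Bilingual Vocational Instructor Program": (100, 10),
    "Bilingual: Basic Grants": (110, 15),
    "Bilingual: Demonstrations": (100, 7.5),
    "Bilingual: State Technical Assistance": (100, 15),
    "Bilingual: Support Services Projects": (100, 10),
    "Bilingual: Training Projects": (110, 5),
    "Bilingual: Short Term Training": (100, 5),
    "Bilingual: SEA Training": (100, 10),
    "Bilingual: Schools of Education": (100, 0),
}


def evaluation_share(total_points, plan_points):
    """Percentage of the review points assigned to the evaluation plan."""
    return 100.0 * plan_points / total_points


shares = {
    name: round(evaluation_share(total, plan), 1)
    for name, (total, plan) in TABLE_3.items()
}

for name, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{share:5.1f}%  {name}")
```

The largest share in this tabulation belongs to a program with 100 total points and 15 points for the evaluation plan; the basic grants case, 15 of 110 points, comes to a little under 14%.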

Given the diverse meanings attached to the term evaluation, it may be
appropriate to examine the regulations in more detail. If we consider
any criterion that contains even the slightest mention of a methodological
Table 3

Weight Assigned to Evaluation Practices in Direct Grant Applications
for Vocational Education and Bilingual Education

Characteristics of the Application Review Process

                                                         Points for  Number of   Total
                            Total    Minimum  Number of  the         criteria    points,
Program                     points   points   criteria   evaluation  listing     methods
                                                         plan        methods

Vocational Education

1. Program Improvement      100      SO       11         8           2           26
2. Indian Tribes            100      30       9          10          3           35
3. Bilingual Vocational     100      30       8          12          3           37
   Education Training
4. Bilingual Vocational     100      50       10         10          3           35
   Instructor Program

Bilingual Education

1. Basic Grants             110      70*      7          15          2           25
2. Demonstrations           100      NA**     10         7.5         2           12.5
3. State Technical          100      50       7          15          1           15
   Assistance
4. Support Services         100      NA       7          10          2           25
   Projects
5. Training Projects        110      NA       7          5           1           5
6. Short Term Training      100      NA       6          5           1           5
7. SEA Training             100      NA       6          10          2           40
8. Schools of Education     100      NA       5          0           0           0

*Established for 1978
**Not available at this time.
issue, we find that, at most, three criteria are contained in the application

tJibuti^n f ' ^«fi""ion of methodological cpncern yields a point dis' 

methodo^LJcal c^"" '"t^ ' '° ' instances the 

nJ^f consideration appearing in these other criteria are a minor 

«lanl If ""^""^^ ^^'8" specification of the management 

fi^it' „ th^ consequence, the liberal definition is likely to be an upper 
li^it on the relative Importance of methodological considerations 



cons J=Lr? % Z"-*"'' ^i"^' the minimum, necessary to be 

poLme thafofo' 'f^^t '° Giv^n these conditions it is 

iuuJl^ f pr ojects which are methodologi cally unsound "5oS ia Ji „ 

funding or at i^gt consideration for fun ding . P^.v^.,., .^p... J^^ ^^^^ 

S'irder"; T'f ^^^'^^ importance of Inltlarpranning 

m order to conduct a successful study. The weighting ach^. -a it fs 



Previous axperience in 
ice of initial planni 

« * jj-j , * * , ^ki^ weighting schraift it 

specified m the regulations, seems to impede early considSiion o" 
methodological issues. - uHsiaeracion ol 

Several options seem feasible. Given that these types of discretionary
grants are made available for testing new, innovative ideas, it seems
r^sonable to increase the ^phasis on the evaluation ...nL.^.u.. 

iuffic^entlf tr'f T'^'" ""f^^"- °' that ther^ is 

sufficiently trained personnel available at the proposal preparation phase 
of the application process (at the LEA and SM level), in the evenJ tSt 
nha«e . ^^^^f 1^ drained personnel, an evaluation negotla^LJ 

tihnl^'i'^ ? fP'"- the grantee could obtain federally « nan,n.^ 

m^l^rin/ol th'"' S a^iuation plan is negotiated, "^cont lSSr 

monitoring of the evaluation process should be carried out for the duration
of the contract. This is especially necessary when an LEA has to rely on
outside contractors and/or when local evaluation capabilities are less
than sufficient. As the application process is currently structured, the
SEA is supposed to receive a copy of the proposal prior to the time the
proposal is submitted. The regulations do not, however, specify who is
to conduct the review, and it seems reasonable to state explicitly that
an evaluation specialist and program speciali st stouir^gfeldr c^STF. 
on tne met no do log leal and e d ucational qua lity of the proposal. C.r- 
^^^TT'Vi'^ personnel Lggest that the review is usually 

bv Ste°?^ ( substantive experts and routine examination of proposals'^ 
by state level evaluation personnel Is the exception rather than the rule 
5.3 CRITICAL EVALUATION ISSUES

During the short history of evaluation research, critical issues
that threaten the quality and integrity of evaluation efforts have become
apparent. This section considers four topics bearing on the evaluation
of new programs and components of programs.

The first section considers implementation of programs, the time it
takes to put prototypes into the field. The second and third address
design issues, especially the use of randomized field experiments to plan
and evaluate new programs. The last section concerns criticism and
reanalysis of the results of evaluation, primarily at the national level.



Program Implementation

Difficulties in Implementation are Chronic. It is not difficult to
identify instances of notable discrepancy between program plan and
program activity. Early Title I funds were used to supplant existing funds
rather than to supplement them, as the law requires. At least some summer
compensatory educational programs did not receive funds in time to spend
them correctly during the late 1960's. More recently, the Study of the
Emergency School Assistance Act suggests that analogous discrepancy
between statutory requirements and local assignment of funds characterized
a few school districts' operating programs under the ESAA. That it is
not only fiscal aspects of a program which may be problematic but also
(more often perhaps) staff level and staff activity, is clear from case
studies of Performance Contracting, Follow Through, Planned Variations,
Middle Start, and others.*

That the problem emerges in small-scale efforts as well as in the
large is evident from recent work at Ohio State on High-school Internship
Programs, evaluation of media-based instruction such as Sesame Street,
studies of variability in home-based instructional programs, and others.
The persistence of such problems suggests that in tests of new programs
both control and program groups be routinely monitored for dilution of
programs, since estimates of program effects under these conditions will
not be accurate. For massive, complex programs, such as Title I, high
quality sample surveys such as the Sustaining Effects Study have been
informative. But we have uncovered no formal federal policy which
would provide for periodic sample information of the same or higher
quality.

The Difficulties of Implementation are not Confined to Education.
Problems in assuring adherence to a program plan and in assaying the
nature of adherence are not confined to educational innovation, of course.
In early experiments on reducing retrolental fibroplasia, nurses were
often unwilling to cooperate with researchers in depriving premature
infants of a highly enriched oxygen environment, then considered beneficial.
Studies later demonstrated that infant blindness is caused by the routine
high-oxygen treatment. Analogous difficulties have been encountered in
other areas of social experimentation; examples from engineering,
biochemical research, and pharmaceutical control are not difficult to find.

of n '^^ °^ Pr oblem A pp ears to be Time for laplment atlon . Estimates 

available Information exists more often in the form of acpert ludament 

Educati J'S'"^-'" McDanlels of the lirSu of 

Education for the Handicapped, for instance, regards one-year
implementation periods for new programs as absurdly inadequate and maintains that
some sponsors of Follow Through models required two to three years as an
absolute minimum for implementation of new programs lifted directly from
the laboratory. According to NIE's Lois-ellin Datta, the time allocated

."'r"f ^""^ "^''"^ operation of Planned Variations !! two years - 

d%IpLe"he'fact't^rH^1/r ^"^'"^ "^"^^^ Timpane observed that 
of A I High/scope sponsors were the most sofhistlcated 

of the Planned Variation sponsors, five years of developmeL In the actual 
Ibe SvLT ^f'"" before they could say the program Js ful.ly SevelopS 
The System Development Corporation's John Coulson puts thp time at thref 

d stricrad^o i"'"? f"^ P'^^"""^ ^'"^'^ case'studles oHchoof 
district adoption of innovations suggest that implementation generally
takes more time than expected. "Virtually all" were in their second year
of involvement with the project and the first year of implementation (p. 119).

in the*n^hn«hTiS^* "'^ ~" P^^i«« assessments than these 

f^ri./ S literature, and believe that the absence of coherent in- 

formation on the topic is a serious problem. conerent m 

UCLA ^"f t^f" Implmentation at the Local Level , According to the 

study of large school districts, only about 20% of the directors
of evaluation rank implementation assessment as one of three most
time-consuming efforts. This stands in contrast, for example, to the 70%
who regard assessing results of programs as very time-consuming or the
66% who spend a great deal of time on measuring student objectives. Their
r^^-.f "^'^ "P°"^ ^^^"'^ district evaiuatSn units 

alKh molt f ' f " - °' '"^'^ ^'^^•^ program lmpl«„en?atLn -= 

although most reports do concern a particular program. 

Defining and Measuring Implementation at the Local Level. In evaluating
the National Diffusion Network, John Emrick and his colleagues at
American Institutes for Research identified some severe problems in
understanding what "adoption of an innovation" means. The issue is critical

and L^hlulfb r '^^'T "^'«d innovations by Ssf 

and it should be evaluated partly on the basis of achieving the eLi of 
adoption The adoption is related to the issue of to^ementatLn His 
S'levl or h'°" °' scale of adoption : ha.^";. 

men^H S< r ^°'nP°"«"ts Of the original innovation are inple- 

d?"ler J f °- " "hat «««nt is the innovation mu?h 

different from what went on before, (c) fidelity, or degree to which the
implemented version of an innovation matches the original innovation.
As a practical matter, AIR defines scale of adoption in several ways,
including the proportion of people who admit implementing only part of
the complete package rather than the whole innovation. In assessing
fidelity, AIR suggests using modification rate: the proportion of people
who say the innovation must be changed in reasonable ways to be compatible
with the system.
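
Both indicators are simple proportions over survey respondents, as the following sketch suggests. The response records and the field names `partial_only` and `must_modify` are hypothetical; they are not AIR's instruments or data.

```python
def rate(responses, field):
    """Proportion of respondents for whom the given yes/no field is true."""
    return sum(1 for r in responses if r[field]) / len(responses)


# Hypothetical adopter survey, invented for illustration only.
responses = [
    {"partial_only": True, "must_modify": True},
    {"partial_only": False, "must_modify": True},
    {"partial_only": False, "must_modify": False},
    {"partial_only": True, "must_modify": False},
    {"partial_only": False, "must_modify": False},
]

# One indicator of scale of adoption: share implementing only part of the package.
partial_adoption_rate = rate(responses, "partial_only")

# Modification rate, used in assessing fidelity.
modification_rate = rate(responses, "must_modify")

print(partial_adoption_rate, modification_rate)
```

Because both figures rest on what adopters say, they inherit whatever misreporting the survey invites.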

Definition and Measurement at the National Level. Establishing whether
a program is implemented, whether it has stabilized, and what its level of
implementation is relative to a standard is a legitimate matter for
evaluation policy. It is not essential that the pertinent questions be
answered precisely at the oversight level, though there may be considerable
need for thorough measurement in the project-level evaluation. In point
of fact, the GAO makes assessment of implementation explicit in its own
policy on oversight, and this continues the agency's durable tradition of
verifying that programs exist or fail to exist. The need to treat
implementation as a policy issue has been reiterated by academic advisors to
government, such as Peter Rossi and Howard Freeman, and agency executives
such as Michael Timpane.

Microassessment may also direct attention to the stage of a program's
development. It is clear from Follow Through, for instance, that direct
transferral of laboratory programs to the field is likely to engender major
implementation problems. It could also include the attention to expenditures,
manpower assignment, samples of transactions and delivery of tangible goods,
which typifies the administrative audit. University of Washington's Richard
Elmore summarizes the view, arguing that it is sensible to measure "dimensions
of classroom activity before a program's introduction as well as after...
to identify major characters in the process and their roles, and
enumeration in detail of what is to be implemented."

In large-scale evaluative surveys of existing programs, the analyst's
interest lies in measurement of implementation rather than actual control.
Experts such as Spady* conclude from school resource research that "crude
and deceptive measures of supposedly relevant resource variables" are a
distinctive problem, and that "the school resources tapped (measured) in
a majority of studies are crudely or even unreliably measured, stress
quantity or mere presence over quality or mode and degree of utilization,
and are really only proxies for resources which actually reach children."
The Pettigrew/Green and Coleman debate about the large-scale data used
in analyses of white flight following school desegregation suggests that
the indicators of desegregation are not always well understood, despite
their ostensible pertinence to assessing impact, nor are they uniformly
reliable.

Microassessment of Program Implementation. Despite the educational
researcher's conscientious attention to measuring children's responses
to programs, such as achievement, and despite the manager's equally
conscientious attention to programs' operational problems, the notion of
developing and using methods of measurement to gauge implementation is
relatively new. Without such methods, it is unlikely that future innovators
will be able to avoid the less visible sins of their precedessors, to de* 
fSLr^ f "^'^ ^""^ d° °^ to construct niore crrfibl

esttoatas of program effect. At bast, that quantitative indicators of 

TionfLd tlTJ^f T ^^«Pl°i^«d t° design .or. senslU^e avalua- 
catholic 1 "^^'^^'^ '° ^^^^°P indicators must be 

are °iso'ci^f ""^^ theorists, and maasuramant specialists. 



tw. t P°^s"l«. w"h sufficient financial support to be more 

cmrfnt«ac^?on"if'f ' P'"^"'"' " "'"^^ ^-"^ "'--^ on l^^. 
She" EvI?ltion 'of Stanford, T^as, and else- 

wnere. Evaluations of viewing time and other factors affectinE recalnt 
Ffellr^f television programs appear in Sesame Strfet! aLtrIc Lpa. 
better' i^ln ^ ^m^- " Is not unreaii^eTTelp^rlfsl 
r^.l^ -° A ? territories traditionally claimed by economists John 

olmenrCo^po'rat'Sn'h" Evaluations suggest that Systi;s Devel- 

opment Corporation has made notable advances identifylne the nwinl ^ «h« 
are informed about budgets within school districts L flic itln^'clJr^ 
information and in idmtlfylng subtle stereotypical fUws In thi pro^Ls 
Critics such as Weinberg assert that even the Coulson work could be to 
proved considerably by focusing on expenditures rathL thL on LgatT 



Pleas L'r"a'ttent?^%"'J.'°'^'";r'"'' " desperate, is r^rkable. 
i-ieas for attention to the problem have been made by policy analysts In 
reviews of planned variations, such as Elmore at Wa Ling to J by federal 
and BL«"ft1i\1? " -^'T " developer! such Is WelLrt 

ScJen^rLearch'c^un:??.'"' ''"'^^'^ ^"^^ Social 

An implication of all this Is that in oversight policy programs 
ought to be identified by more than pious promise rhetorical labS n. 

™er"f ^1 «P«"i"lly/be?ter inio^mltJof ^n iemSra ' 

character of implementation. stereGtyplcal misdirection, and other crude 
dimensions of implementation ought to be collected systkatically! W^at 
Is less clear is how the job can he done effectively. Options under 

Eff ects's udv 'f^f { ^T"'"' ^^"P^" ^"^^^ - ^he L^^f Sing 

Follow Th^o^H special investigations such as the GAO's examination of 
Follow Through, and uniform reporting which avoids or at least r^ognlzes 
mlsreporting. a„rlck and others recommend that "criteria for assefsinf 
documenting and reporting the status and quality of adoptions be developed" 
thl Jd ? "'T P^S'^^? discussing the National Diffusion Network, ^ut^ 
the advice is sensible for at least some new programs. 

essential ° " research on methods of assaying Impltoentation is 
essential if we are to understand (1) how to construct less exnensive 
mettods of observation (the «.lstlng ones are often expensiv^r al 
what impact the observational processes have on implementation (eg ' 
they may foster implementation in some cases or the Illusion of impl a- 

T^^^^^'i -^'J "^'''^ observable dimensions, including quality 
of student-tttchar interactions rather than just frequency ): 
Evaluation Design

By evaluation design here, we mean a plan for assigning individuals
to programs or to program variations, collecting data, and analyzing
results so as to produce reasonably accurate estimates of the relative effects
of programs. This stress does not mean that other types of evaluation are
any less important in a particular case. Questions of how many students
are served and what services they receive appear to be no less important
in the Congress's view.

Quality of Design. The quality of an evaluation design for estimating
a program's effects will influence the quality of evidence about those
effects. To the extent that state-of-the-art design is ignored, evaluative
data on program effects will be ambiguous at best and misleading at worst.

The idea that evaluations ought to be designed and begun before a new
project's installation was novel to most funding agencies 15 years ago, and
it is still a novelty for most private foundations. Rather, the emphasis
was on after-the-fact analysis (post mortem or opinion survey), with its
attendant shortcomings. It is remarkable that attention to evaluation design
has been focused in the intervening years to the point where policy emphasis
on design is evident at the Assistant Secretary level in DHEW, judging from
Henry Aaron's testimony before the Committee on Human Resources. This does
not mean that most designs employed in recent studies have been perfect. It
does mean that the poor quality of earlier evaluations is better documented,
the design-related reasons for poor quality are better understood, and there
is some political pressure to eliminate or control inept designs.

The justification for attending to evaluation design stems partly from
reviews of educational program evaluations generally and commentaries about
inept evaluations at the school district level (e.g., Hawkridge, Chalupsky,
and Roberts on Title I in 1968). Others reiterate the lesson based on
case studies of selected educational program evaluations and medical projects
(e.g., by Gilbert, Light, & Mosteller at Harvard). In an extended review
of single projects such as Head Start, Rivlin and Timpane, for example,
observe that "researchers found out, late and the hard way, about the
costs that design flaws and compromises ... exact in the form of weak or
uninterpretable results" (p. 11). More rudimentary design problems helped
to make data from Performance Contracting evaluations nearly useless for
estimating program effects, judging from Gramlich and Koshel's analysis at
Brookings, though this evaluation was notably informative in other respects.
Broader critiques, such as Bernstein and Freeman's, make the same point.
Such studies are themselves not flawless (see Abt & Abt, and Wortman, for
example), nor do they accord much publicity to well-designed evaluations
which, though not in the majority, are no less important. They are clearly
a useful vehicle for understanding that some evaluations have been inadequate
and for understanding why they were inadequate.
The problem of assuring good evaluation design is not confined to the
education sector. The GAO's early criticism of evaluation of the
Experimental Housing Allowance Program was based partly on GAO's judgment that
the design was flawed. In medical research, Chalmers, Block, and Lee
provide clever evidence, comparing results of randomized experiments against
those of nonrandomized tests, to make a similar point. Silverman's
description of 20 years' work on blindness among premature infants, which appeared
in Scientific American, is an especially nice model from medical research.
It Identifies the ways in which poorly designed tests were both helpful and 
dangerously misleading in early work, and the later resolution, usiSg^ 
"rtl^lS^^f °f =°"t"versy over the causes of bllndnlss. 

Partly because education, unlike medicine, is a rather recent interest for
the science writer or historian, long-term case studies of the consequences
of poor design decisions in educational research are more difficult to find.

The implication is that attention to quality of an evaluation
design should be incorporated routinely into policy on program development
and evaluation. This is not to imply all evaluations must be of high quality.
It is to say that the choice between tolerating a minimal evaluation and
supporting a high-quality one ought to be recognized as such.

,n.r^'^f^ "fn '-^^^ ^ programs should not be designed and put into the field 
apart from their evaluation is explicit in reports of the GAO on specific 
projects such as Follow Through and in more general guidelines on oversight 

and into IvJl f°" 1'° '^'"'f attention to quality into Congressional review 
and into evaluation law is also reflected in legislative staff activity, 
for^example. early guidelines for Congress developed by Foskett and Fox' 
thP n sfii^^^s. and Franklin Zweig's more recent efforts to assist 

the U.S. Senate's Committee on Human Resources. Within agencies, it is
difficult to distinguish between individual professional preference and
general agency policy. For example, NIE's Lois-ellin Datta stresses the
roles played by former NIE directors Thomas Glennan and Harold Hodgkinson
in supporting rigorous assessments of NIE's Career Intern Programs. At
least one divisional manager at NIE has been consistent in encouraging high- 
some ' l""'r experience programs despite early resisfance along 
some contractors to the policy. At the level of the Office of Program" 
!Jr^n^"- f ^^^^"^tion in the Office of Education, some directors have been 
strenuously arguing for quality in design, basing their advocacy partly on 
the controversy over outcomes of the Head Start evaluations (Evans. 1974. 
U.S. Senate Subcommittee on Oversight Procedures. 1976). At the Assistant 

?n%uaStv oFe^f'^" ^^'^^f ^^^^ generally the lack of uniformity 

in quality of evaluations supported by DHEW, suggesting that they run from 
bad to excellent and that the outcome depends heavily on thought invested 
ac cne design staga* 

A policy which emphasizes quality in evaluation design cannot be expected
to implement itself or be adopted immediately. It is for this reason that
improvement in caliber of staff responsible for review of design is crucial.
iL^r'f A "T" according to Aaron, been vigorous at the 

level of Assistant Secretary for Program Evaluation in DHEW. Judging from
the conversations with a few members of that staff, the effort has had
remarkable results. However, we have been able to uncover no serious
academic attention to staff improvement in this sector or in other federal
agencies, such as the GAO, where staff improvement has had high priority.

Randomized Field Experiments

Randomized experiments, in which children, classrooms, or schools
are randomly assigned to one of two or more alternative programs which
have the same aims, are a promising vehicle for obtaining a fair estimate
of the relative effectiveness of the programs. The usefulness of randomized
tests in principle is generally not at issue. When experiments are
conducted properly, orthodox theory of experimentation guarantees that
long-run estimates of effects will be unbiased. The cost of this increased
assurance of interpretable results is a greater demand on managers of
programs. The demand includes executing and temporarily maintaining the
randomized assignment and other necessary cooperation with the evaluator.
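
A minimal sketch of the assignment step follows, assuming classrooms as the unit and two alternative programs. The unit labels, program names, seed, and function names are hypothetical; a field experiment would add stratification, attrition checks, and monitoring that the assignment is maintained.

```python
import random


def assign_at_random(units, programs, seed):
    """Randomly assign each unit to one program, keeping group sizes balanced."""
    rng = random.Random(seed)  # a recorded seed lets the assignment be audited
    shuffled = list(units)
    rng.shuffle(shuffled)
    # Deal the shuffled units out to the programs in rotation.
    return {unit: programs[i % len(programs)] for i, unit in enumerate(shuffled)}


def mean_difference(outcomes, assignment, program_a, program_b):
    """Difference in mean outcome between two randomly constituted groups."""
    def group_mean(program):
        values = [outcomes[u] for u, p in assignment.items() if p == program]
        return sum(values) / len(values)
    return group_mean(program_a) - group_mean(program_b)


# Hypothetical units and programs, invented for illustration only.
classrooms = [f"classroom_{i}" for i in range(1, 9)]
assignment = assign_at_random(classrooms, ["program_A", "program_B"], seed=1980)

for unit in classrooms:
    print(unit, assignment[unit])
```

Once outcomes are collected, `mean_difference` gives the simple contrast between groups. The estimate is fair only to the extent that managers keep units in the programs to which they were assigned.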

Argument about the use of the design more frequently concerns the idea
that randomized experiments are rarely feasible in field settings. "Rareness"
and "feasibility," however, are infrequently specified by the government
policy group such as the Nil's Task Force on Resources Planning in 197^^ 
or by the individual analysts such as Horst^ TallmadgSi S Wood* It is true 
that, although the design Is not new, its application in evaluating educational 
and other social programs is relatively novel* But novelty does not establish 
lack of feasibilltys and a ^notable, if not large, nianber of field experiJients 
have been mounted* 

The most recent examples include evaluations of: parts of the Emergency
School Aid Act (by Coulson), a subset of career education programs (NIE; Datta),
the Middle Start program run at Oberlin College by Yinger, Ikeda, Laycock, &
Cutler, educational TV programs in health developed by Children's Television
Workshop (Mielke & Swinehart; Minor & Bradburn), preschool education (Bogatz &
Ball), primary education (Ball & Bogatz), radio-based mathematics instruction
(Searle, Friend, & Suppes), and even grade retention (Jackson). Oliver
Moles at NIE has managed to implement randomized tests of programs that were
designed to reduce disruptive school behavior. Welch and Walberg at Illinois
are among the embarrassingly few researchers in any discipline to have mounted
randomized experiments to test a dissemination/utilization strategy. (George
Fairweather and Louis Tornatzky have done remarkable work in mental health.)
Rickel, Smith, and Sharp of Wayne State University have executed remarkably
rigorous tests to establish the effectiveness of preventive health care for
preschoolers with behavioral and emotional problems served with Title I
program funds in the Detroit School District. The Cali, Colombia, tests
on education and health programs for malnourished and educationally deprived
children are a milestone in the developing countries (McKay, Sinisterra,
McKay, Gomez, & Lloreda). Partly to capitalize on problems and hard lessons
in the original planned variations study, USOE issued an award-winning RFP
in 1976 for higher quality evaluation of a few planned variations programs for
youth. The project was terminated by USOE for reasons lying in the program
development arena as well as in the design arena. Most such field tests have
been moderate in size, the Emergency School Aid Act (ESAA) being an exception.
Others have not been successfully executed even if they were well designed
by technical standards, as one should expect. Boruch, McSweeny, and
Soderstrom's bibliography of attempts to mount randomized tests includes both
successfully executed ones and disasters.

The state of the art in executing field experiments is developing
rapidly. One of the clear lessons of the past 10 years is that solutions
to managerial, institutional, and other problems are at least as crucial
as the statistical and analytic ones. The incentives for experimentation
have also been better articulated, partly on the basis of case studies of
nonrandomized evaluations. This includes the arguments in both academic
and policy quarters that the original Head Start evaluations inadvertently
produced biased estimates of program effects (Campbell & Erlebacher; Evans).
And it includes observations by Smith and Datta that the results of some
quasi-experiments, such as Head Start Planned Variations, are unlikely to
provide clear evidence about the models' relative effects even if they are
informative on managerial grounds. The special problems engendered by the
need for a strict regimen in assigning individuals to programs, and the
conditions under which randomized trials are feasible, are better understood as
a result of case studies on evaluations in education and other areas (Riecken
and others). The legal status of random assignment rules has, until recently,
not been clear. Three judicial decisions on the legitimacy of randomization
in testing programs are reviewed by Breger; those decisions and the underlying
theory are generally favorable for the judicious experimenter.

The implication of all this for policy is that randomized field tests
should be regarded as a legitimate option, if not always the preferred one,
for testing new social programs when estimates of relative program effect
are important enough to justify their cost. Consistently well-executed
field experiments in education are likely to require greater changes in
legislative posture than are those for medical research. As long as political
feasibility, rather than interest in impact, is the primary rationale
determining evaluation design, demonstration projects, rather than experimental
tests, are likely to remain the norm.

Quality of Analysis

One of the main results of surveys and interviews with Congressional
staff members is that they are concerned about credibility of the findings
of evaluation at the national level, and to a lesser extent at the local
level. The matter is a red herring in one respect and a legitimate issue
in others.

The specious aspect concerns assaults on evidence when it turns out
not to be favorable to one's position. This is especially true of
evaluations which are dedicated to estimating program effects. It has clearly
occurred in recent controversy over evaluation of bilingual education programs;
it occurred as well in early Title I evaluations. Simply put, not all
criticism is competent. The legitimate feature of the problem is that evidence
is often incomplete and its quality uncertain. To the extent that the
state of the art in analytic methods is unsettled, then some argument
about the proper analysis and proper inferences is warranted. For feasible
evaluation designs there is considerable room for debate over proper
analysis and conclusions. This is as chronic a problem in medicine as it is
in education.

Critique and Secondary Analysis

One of the major implications of contemporary experience is that
major program evaluations should be subjected to formal critique and, where
necessary, reanalysis of original data. By critique here we mean balanced
and independent appraisal of the evaluation, not just negative criticism. In
particular, the results should be reexamined by individuals independent of
the original investigators. "Findings" here include data, analysis,
conclusions, and recommendations. The implication holds for any type of
evaluation used as a basis for policy.

The reasons for this policy and the forms it may take have become
clearer over the past five years, partly because of actual reanalyses of
data stemming from evaluations of Sesame Street, of Head Start, the Equality
of Educational Opportunity Surveys, Follow Through, and others. The most
recent illustration stems from a multi-year Rand investigation of federal
programs supporting educational change. One part of the report criticized
technical assistance rendered by consultants to a school district on program
development. This aspect of the report was widely publicized and widely
believed in and out of government. We are aware of only one critique of
the study, by Lois-ellin Datta of NIE. It shows nicely how the data
presented in the report do not support such a simple conclusion.

Regardless of whether Rand's conclusions are justified, the point is
that no serious formal criticism of the report was undertaken before or
during its release. There are several important reasons for considering
formal critique or reanalysis, regardless of the skill and integrity of
the original investigators. If one espouses the view that most effective
internal evaluators should be benign skeptics, the need for external
analysts who are benign in different ways, if not less benign, is justified
on scientific and political grounds. To the extent that independent
secondary analysis is routine and visible, it may impede the corrosion of
credibility of larger-scale program evaluations of the sort described by
McLaughlin in early Title I programs. The more durable reasons for
secondary analysis are scientific: verifying quality of information and
identifying both egregious and sophisticated errors in analysis are basic
to that interest. That egregious errors are sometimes not hard to find is
evident from GAO's reexamination of early Follow Through evaluations, for
instance. Secondary analysis is an economy-minded notion in the sense that
an expensive data set is made to work repeatedly at low cost, if we judge
by the Head Start experience and Lois-ellin Datta's examination of the
reanalyses, regardless of its quality.

This is not to imply that all reanalyses of raw statistical data will
be worthwhile, since the worst of cases can be identified easily in critique
by external reviewers. Nor will all reanalyses produce new findings. Indeed,
merely certifying that original estimates of program effect are reasonable
may be sufficient. Finally, reanalysis may be no more illuminating than the
original analysis. This is still useful in the simplest sense. Two ambiguous
interpretations may help to verify that things are indeed as bad as they seem
to be.

The view that reanalysis of raw data should be regarded as a legitimate
option in overseeing evaluations has been adopted in GAO's draft monograph
on appraising impact evaluations. Within some divisions of agencies such
as NIE, for instance, a de facto policy of reanalysis exists. We judge this
from conscientious maintenance of data generated by evaluations of the
Emergency School Assistance Act, career education programs, and others.
Some exploration of the topic as a matter of internal policy has been
undertaken at the federal level by Virginia Koehler, occasionally at the
state level judging from Powell at Colorado, and by academic analysts.
NIE appears to be the only agency until recently to have made a substantial
fiscal investment in supporting research on reanalysis of program evaluation
data. But the general policy of supporting secondary analysis has been
neither officially sanctioned nor made explicit at NIE, USOE, or other
agencies, such as the Agency for International Development, which evaluate
new educational programs.

Data stemming from surveys and evaluation activities by Parent-Teacher
Associations, community groups, and the like are rarely subjected to any
real reanalysis despite their import for local policy and occasional import
for national policy. But the justifications given for reanalyses of federally
supported evaluations appear no less pertinent to this arena. It is not
unreasonable to expect that federal policy can serve as a model in this
area for state agencies and legitimate interest groups at the local level.
To the extent that policy on evaluation and on support of these groups'
applied research also stresses secondary analysis, the quality of the
product is likely to improve.

Storage of and Access to Information for Secondary Analysis

The Freedom of Information Act guarantees access to a variety of
information useful for evaluation at the federal level, and some state laws
provide similar assurance at that level. In principle, this can help to
justify access. Despite the law, it has been difficult at times to acquire
data used for program evaluations. The difficulty is certainly not
confined to education, judging from the illustrations in crime deterrence,
biochemistry, and medicine. The difficulty is influenced by flawed program
management, poor record keeping practices, professional jealousies, and
the bald fear that an analysis will be found incompetent. Formal policy
cannot be sensitive to each problem, but it can be used to reiterate the
message carried by law and to routinize the opportunity for conscientious
reanalysis. The only written statement of internal policy we have been
able to locate has been developed by the Law Enforcement Assistance
Administration (LEAA). The policy hinges on the requirement that statistical
data generated in LEAA-supported research be turned over to a special
LEAA-supported archive at the University of Michigan when research reports are
submitted. It is consistent in spirit with preliminary papers developed
at NIE and with general policy developed by the President's Federal
Statistical System Project.

Adoption of the idea that data ought to be made available for
reanalysis creates choices about whether to support decentralized archives,
whether to screen data sets for their political or scientific value, and
so on. The National Archives have made explicit their interest in storing
all machine-readable evaluative data, and this will help to attenuate
management problems in the policy's implementation. Vehicles other than
the National Archives are worth considering since no quality control system
for screening of data or documentation will be undertaken by the Archives.

Privacy and Secondary Analysis. When secondary analysis depends solely
on statistical data from which individual identifiers have been removed,
there are usually no special privacy problems engendered by the data's
disclosure, at least for individuals. A class of problem cases, in which
deductive disclosure of information on individuals is possible, is so small
and so idiosyncratic for major studies that, as a matter of policy, it can
be accommodated by ad hoc review of the special cases.
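
As an illustration of what such an ad hoc review might look for, the
sketch below flags records that could permit deductive disclosure. The
report does not prescribe a method; the small-cell rule used here, the
function name risky_records, the field names, and the sample records are all
hypothetical and serve only to make the idea concrete. A record is flagged
when fewer than a chosen number of records share its combination of
descriptive characteristics.

```python
from collections import Counter


def risky_records(records, quasi_identifiers, min_cell_size=3):
    """Return records whose combination of quasi-identifier values is
    shared by fewer than min_cell_size records in the data set."""

    def cell(record):
        # The "cell" is the tuple of descriptive values for one record.
        return tuple(record[name] for name in quasi_identifiers)

    counts = Counter(cell(record) for record in records)
    return [record for record in records if counts[cell(record)] < min_cell_size]


# Hypothetical de-identified records: no names, yet some rows are unique.
SAMPLE = [
    {"school": "A", "grade": 3, "language": "English", "score": 41},
    {"school": "A", "grade": 3, "language": "English", "score": 55},
    {"school": "A", "grade": 3, "language": "English", "score": 47},
    {"school": "A", "grade": 3, "language": "Spanish", "score": 38},
    {"school": "B", "grade": 4, "language": "English", "score": 60},
    {"school": "B", "grade": 4, "language": "English", "score": 52},
]

if __name__ == "__main__":
    for record in risky_records(SAMPLE, ["school", "grade", "language"]):
        print(record)
```

In this toy file the single Spanish-speaking third grader in school A is
identifiable from the descriptive fields alone, which is the kind of special
case a reviewer would set aside for coarsening or suppression before release.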

The more likely problem, judging from recent arguments over access to
[...] vs. Board of Education of the City of New York, is that (a) claims
of the risk of deductive disclosure will be made, (b) no clear standard and
no formal explication of the evidence for the claim is published, and (c)
the claim will be entrained in other, more important reasons for refusal
to disclose (e.g., protecting institutions, such as schools, against charges
that they have violated individual civil rights).

For secondary analysis of identifiable records, the privacy problem
is rather more crucial. It has been subjected to investigation by
oversight agencies such as the GAO, advisory groups such as the Social
Science Research Council, and legislatively mandated review bodies such as
the Privacy Protection Study Commission. The first of two major conclusions
stemming from this work is that there is a large variety of procedural
methods for linking data from different record systems and for checking
the quality of records, without abridging privacy rules governing each
source of information. Those methods have been tested in education,
epidemiology, and other applied research. They should be considered by
independent researchers and by oversight agencies such as the GAO where rules
limit access to identifiable records. Second, where identifiable records
are necessary for legitimate research, disclosure of those records to the
researcher should be permitted provided that stringent conditions be met.
Those conditions include conscientious review of the process, clear agreement
on the research function and nature of disclosure, and prohibitions and
sanctions against violations of agreement and redisclosure.

5.4 CONSTRAINTS ON THE QUALITY OF EVALUATION

Technical factors are not the only influence on the quality of
evaluations. Three other factors are considered here: the appropriateness
of limiting communication between agency and Congressional staff;
Federal Clearance Requirements; and independence, integrity, and candor in
reporting.



Direct Contacts Between Congressional Staff and Agency Staff 

Over the last five years, memoranda have been issued periodically by the
Office of the Secretary of HEW concerning contacts between Congress
or the Congressional staff and federal agency staff. The theme has generally
been that such contacts in the absence of an agency legislative liaison staff
are not desirable. There have been official demands that liaison accompany
any staff.

The justification for restrictions on contact centers around the
probable difficulties engendered by numerous agency staffers dealing with
the Congress directly. Positions of the Department may be misrepresented
or misinterpreted. The difficulty is not confined to Education, of course.
We recognize that there is a need to avoid program-initiated "lobbying" but
take issue with the uniform application of this policy as it is applied to
the process of conducting evaluations. Producing a timely, high quality,
and potentially useful study requires unimpeded access to relevant
Congressional staff to ensure that issues are properly addressed before and
during the execution of the study. While restricted contact may be
appropriate for some Departmental issues, it is counterproductive when
evaluations are concerned.

The major side effect of this institutional policy appears to have
been to impede the agency's ability to understand diverse Congressional
interests in evaluation, to impede development of a reasonable basis for
Congressional staff views of agency evaluation activities, and to impede the
negotiations which are part and parcel of any major evaluation enterprise.
Within OE, these sentiments have been expressed by two of the three
divisional directors to whom we spoke. And the public remarks by knowledgeable
Congressional staff reflect the same spirit. John Jennings, for instance, has
pointed out flaws in OE activity: misunderstanding the law and its
requirements, creating an evaluation plan when the evaluation itself has no
chance of being used by the Congress, and others. Each of these could be
avoided with reasonably unrestricted conversation.

The result of formal restrictions is that agency staff have not felt
free to call Congressional staff to ask questions, to verify their own
interpretations, to clarify problems in what to evaluate and how to evaluate
it. This is compounded by inability or unwillingness of some Congressional
staff to initiate a conversation with the agency staff.

We believe it is a sufficiently complicated and serious problem to
warrant attention from Congress and the Department of Education. Major
restrictions on conversation about evaluations which the Congress demands
and the Department must provide are absurd when the utility of the evaluation
depends so heavily on mutual understanding.



Proscription against disclosure of information collected during an
evaluation without formal clearance can be and has been an impediment to
evaluation. It can, for instance, truncate the government's access to manpower
for evaluation. In the worst case, it may lead to delays in providing
information.

To illustrate, consider directives issued by the Secretary of Health,
Education, and Welfare in 1978. Until then, evaluation reports had been
issued routinely by the Office of Evaluation and Dissemination to
authorization and appropriations committee members, and relevant members of
Congress and the agency. Early in 1978, Secretary Califano evidently became
"aware that procedures for transmitting reports needed to be more clearly
defined," and made his recommendations explicit in an April 10, 1978 memo. The
immediate cause for concern appears to have been queries to the Secretary on
information released earlier. The action taken in a memo from Califano
to DHEW office heads was to require that all reports and evaluations be
reviewed by the Executive Secretariate prior to release. The single
exception noted involved reports specifically mandated by Congress to go to
Congress directly. The requirement was put into effect for final reports.

The memo also appears to have been instrumental in strengthening a
requirement that contractors abide by the same rules. In particular,
Article 28 became part of the boilerplate for evaluations executed with
USOE funds. The article requires that the contractor not disseminate data
without written consent of the contracting officer.

The general requirement for clearance by the Executive Secretariate
led to notable delays in release of reports produced by the Office of
Evaluation and Dissemination. Judging from the 13 reports submitted for
review between January 1979 and May 1979, the delays range from 18 days to
133 days; a quarter of the reports were delayed by over two months. Moreover,
only one report of this collection was returned to the originating division
for revision. According to a memorandum from Jim Pickman of OED to Rick
Cotton of the Secretary's office, the remainder received no notable
modification as a result of review by the Secretariate.

At least one major effort was undertaken by OED executive staff to
assure that the Office of the Secretary reviewed reports within a specific
time. It did not succeed. The current
Office of the Deputy Assistant Secretary of Evaluation and Program
Management has, however, instituted a new clearance system and established a
clear limit of 10 days on release of reports. In particular, the offices of the
Secretary, Under Secretary, Assistant Secretaries, and Deputy Assistant
Secretaries have "ten working days to review the summary and, if they wish,
the contractor's report," in accord with a directive from the Deputy Assistant
Secretary for Evaluation and Program Management.

Article 28. This clause prevents a contractor from discussing data
produced in the execution of the study without permission of the contracting
officer. Interpreted by at least some lawyers, this means that information
generated during an evaluation cannot be presented, for example, at professional
meetings prior to agency review. For the uninformed, it implies that results
cannot be discussed without prior permission. We have been told by executives
of two major contracting organizations that their staff members merely send
a copy of the paper they plan to present at a professional meeting to their
project monitor (not to the business office, of course) and routinely receive
permission to present results. The basis for this agreement appears to be
the reasonable expectation that professional presentation will help illuminate
strengths and weaknesses of the information. Often the routine approval is
provided because of trust in the capability and integrity of a contractor
selected in a competitive process. Apparently, the article affects some
academic institutions in the same way.

However, some well regarded academic institutions do not treat the
restriction on information as informally. As illustrated in the case study
included here, Northwestern University refused to agree to the pertinent
article in contract negotiations for research leading to this report. The
National Academy of Sciences' legal counsel raised a similar issue in
negotiations for a parallel Committee on Program Evaluation. The argument to
many is an unnecessary one, given that no funds for evaluation in education have
ever been withdrawn for violating the article. But it is sufficiently real in
principle to warrant argument.

Independence and Integrity 

Federal Agencies. The question of whether the agency responsible for
reviewing a program should also be responsible for evaluation is an old one.
It is implicit in the early history of the U.S. General Accounting Office
and more recent history of the Office of Evaluation and Dissemination at USOE.
The arguments for and against the approach can be put tersely.

The advocates of conducting evaluation independent of program staff
argue that evaluation reports are more likely to be candid and fair if they
are conducted by individuals outside the jurisdiction of the program. On
the other hand, advocates of evaluations being carried out by program staff,
rather than by independent evaluators, maintain that program staff are more
knowledgeable and better informed, and that evaluations can be more useful when
done by program staff.

Case Studies of Impediments
to Timely Research:
Article 28

Public Law 95-561 required that an evaluation of educational program
evaluations be undertaken during 1979-80 and that a report be submitted to
Congress by the Commissioner in July 1980. The law further required that
the Commissioner of the U.S. Office of Education make available funds to
an independent agency to undertake the study. Independence was emphasized
in the USOE's work statement to assure fair examination.

John Evans of the USOE's Office of Evaluation and Dissemination (OED)
invited proposals for the research from several independent researchers.
On July 5, Boruch discussed the matter with Evans and John Jonas of
Representative Holtzman's office. A formal proposal was submitted on August
31. The discussions with Evans and Jonas repeatedly stressed independence
of the investigator.

Northwestern University was provided with a copy of the proposed
contract on September 25 by the USOE's Contracts Office. Officials at
Northwestern read the business section of the proposal, finding two offensive
clauses. The first, involving time commitments of the principal
investigator, was worked out. The second clause involved a more fundamental
conflict. The clause said:

ARTICLE 28. DISSEMINATION OF DATA. No subject data as defined
in the "Rights in Data" clause in the General Provisions may be
disseminated without the express written consent of Contracting Officer.

University counsel objected to the clause on grounds that it violated
academic freedom of inquiry and expression. The principal investigator
concurred in a telephone conversation with Charles Seibert, Assistant
Director of Northwestern's Office of Research and Sponsored Projects. The
University signed part of the contract but returned the offending portion,
unsigned, saying that the clause would have to be removed before the
University could commit itself to the work. The partly signed contract was sent
to the USOE Contracts Office on September 27. During September 28, Seibert
spoke to staff members of the Contracts Office, and was told that the Office's
position was immutable. The clause would stay. Seibert then assumed the
contract could not be consummated then, and since the fiscal year ended
on September 30, a Sunday, it could not be consummated at all.

On Monday, October 1, at 1:50 p.m., Boruch called OED to verify the
problem and to determine if indeed the Holtzman Project had gone up in
smoke. The OED announced that the Contracts Office had deleted the clause
and signed its portion of the contract over the weekend after some argument.
We were both told that the National Academy of Sciences had taken the
same position as Northwestern's independently in negotiations on another
contract, and the clause had been deleted for NAS also. Boruch called
the Contracts Office to confirm this, and did so.

Exactly the same arguments appear in international debates by directors
of statistical agencies. Recent meetings of the International Statistical
Institute in Warsaw, for instance, focused heavily on whether those agencies
should be responsible for analysis of data as well as data collection, or
whether routine analysis by independent agencies is warranted.

In this Project, the question emerged in interviews with federal agency
staff, and was implied often by Congressional staff. The debate is complicated
because bureaucratic territory and political power are at issue. An agency
may prefer internal evaluation partly because budget can then be increased
to conduct the evaluation.

Federal Contractors. We have encountered no evidence of falsification
of data and no complaints about the matter for large federal contractors.
Arguments do arise, however, over technical analyses and what can be inferred
from the data, and occasionally over the extent to which a contractor's
recommendations are based on the evidence. These seem to us to be more a matter
of competence and reasonable differences in values than deliberate deception.

The questions answered by the federal contractor are normally posed
by the federal agency staff. If those questions are inappropriate or narrow,
the answers, regardless of their accuracy, will be partial at best. We
have found contractors who believe agency staff asked the wrong questions.
For example, Clark Abt maintains that the focus of research on Follow Through
should have been on identifying remarkable successes. But we've encountered
no real questions about integrity of OED staff in performance of their duties.

A number of mechanisms are already in place to enhance quality of work.
To some extent these make integrity a moot issue. The peer review system
does operate for any grant or contract issued by OED and NIE, except for
regional laboratories and centers, and Congressionally mandated projects.
Review and advisory boards do exercise oversight on quality of design,
execution, and analysis, though this varies from project to project. In
principle, data are available for reanalysis and indeed a variety of
reanalyses have been undertaken.

Independence at the Local Level. In large school districts that have
evaluation units, the units depend most heavily on school operating funds
for their support. They do, however, depend on federal funds directly. To
illustrate, the Austin School District's Office of Research and Evaluation
was created in 1973 on the basis of an ESEA Title III grant. District funds
were gradually used to build up the unit and current state funding permits
hiring research personnel. Fiscally, then, evaluators are not entirely
dependent on their school district resources; neither are they completely
independent.

Administratively, evaluation units are responsible directly to
superintendents rather than to program directors. In this sense they are
independent. However, they provide service to programs and their livelihood
depends at least partly on cooperation of staff and their relations with
institution staff.

Administrative independence was not identified as a problem in the
districts in our site visit sample: most of the evaluation units reported
to the central administration either directly or through an intermediary
unrelated to the programs. Results from the CSE study suggest that
administrative independence has either been achieved or is not a serious
problem for most large districts. For districts without an evaluation unit,
administrative independence is less frequent. The options here for promoting
independence are to encourage the establishment of evaluation units for
moderate size LEAs through the use of federal funds, as in the case of the
Austin School District. For smaller districts, there simply are not enough
pupils eligible for federal assistance to warrant establishing a unit; the
use of contractors might be urged or numerous small districts might pool
their resources. Consortia have also been offered as an option to ensure
the availability of sufficient monies to attract high caliber outside
contractors.

Lack of Fiscal Independence for Evaluation. One problem that we
encountered in our site visits and phone conversations concerns the issue of
fiscal independence. While evaluation units in most SEAs and some LEAs are
administratively independent of programs, they usually do not have control
over the allocation of evaluation monies. This responsibility typically
lies in the hands of program directors or administrators of federal programs
and may invite conflict. For example, in Site G, the Director of Federal
Programs controlled the Title I evaluation monies. Consequently, his approval
of monies for Title I evaluation efforts and even evaluation staff attendance
at Title I workshops and professional conferences was required. Since he was
not particularly receptive to evaluation, this consent was not always forthcoming
and resulted in restriction of opportunities for staff training, professional
activity, and improvement of evaluation activities. The conflict ultimately
contributed to the resignation of a highly trained Title I evaluator. Another
district, although permitted by the SEA to allocate up to 5% of Title I monies
for evaluation, was embroiled in disputes between the Directors of Title I and
the Evaluation unit. While the Director of Evaluation wanted to improve the
quality of Title I evaluation efforts by engaging in activities which went
beyond compliance, the program director was unwilling to allocate the maximum
set-aside permissible. Consequently, these evaluation units which wanted to
improve evaluation efforts or respond to other district requests were hampered
in their efforts to do so.

This is a complex issue, and the need for alternative allocation mechanisms
is exemplified by the situation in Site B. Here fiscal dependence did not limit
the nature of evaluation activities but rather allowed greater flexibility.
Prospective budgeting is often difficult in evaluation as unanticipated problems
arise in data collection and other such tasks. In this district, the Director
of Federal Programs did award additional monies for evaluation when problems
arose and a strong case was presented. This would not have been possible if
fiscal independence existed. However, this was partly conditional upon the
Director of Federal Programs' positive attitude toward the evaluation process.
Relying on the presence of positive attitudes to ensure cooperative relationships
is insufficient; other options need to be entertained.
to facilitate independence seems warranted.



Policy on Independence. Few school districts
garding independence of the evaluation unit. The
District is a remarkable exception. District Supe
for example, has proposed that to assure independe
must support the idea through explicit policy and
Austin policy makes the unit responsible to super
specific program directors, encourages open commu
unit and the public, media, and school board and

for assuring the unit's independence. The latter
for hiring evaluation staff, designing evaluations, and managing a 1
evaluation funds to the evaluation unit. Reports issued by the unit
only the names of unit authors and program staff are forbidden to al
reports but are encouraged to respond to and comment on reports. Th
procedures are summarized in the exhibit which follows.
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Credibility and Trust at. 
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utilization and survey suggests that credi- 
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Recent surveys of 39 large 
Paula Matuszek and Ann Lee, appeJ 
do have high credibility. Few ofl 
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dependent assessment of their unit, 



school district evaluation units, by Austin's
to support the idea that established units
the directors, for instance, maintain that
justification for spending money on in-
The improvement of the unit's evaluation

designs and communication are far more important justifications. It 1
however, that these directors of large school district evaluation unit
believe that independent external evaluation of their units will have
dental but notable positive effects in the sense of increasing presti
credibility, and
Independence of Evaluations by Office of Research and Evaluation

The Office of Research and Evaluation has the independence necessary
to assure unbiased, forthright reports of the district and of indi-
vidual programs' standing or achievement. This independence is in-
sured by the following procedures:

1. Evaluation staff members are selected by and
   responsible to the Office of Research and
   Evaluation.

2. Final authority with respect to evaluation designs
   until they are presented to the cabinet and board
   rests with the Office of Research and Evaluation.

3. Funds for evaluation are administered by the
   Office of Research and Evaluation.

4. Reports shall bear only the names of evaluation
   office staff on the title page. A separate sheet
   listing program staff members may, however, be
   inserted at another point in the report.

5. Reports prepared by the Office of Research and
   Evaluation are not altered in any way by AISD
   staff members who are affected by the evaluation
   although all members shall have the right to
   respond or object to any evaluation report and
   to have their comments presented along with the
   report.



From: Davidson, J. L. The research and evaluation unit: Helping
your school board make decisions. Paper presented at
the Annual Meeting of the American Association of
School Administrators, Atlantic City, New Jersey, 1976
(Available from the author, Superintendent, Austin
Independent School District, Austin, Texas).
Apart from the natural interactions which engender trust between
evaluator and audiences for evaluation, specific administrative action is
warranted. For Austin School District's Jack Davidson, the action should
take the form of assuring independence, quality and credibility. In par-
ticular, he maintains that administrators must be committed to the free
flow of both positive and negative information and that it is the
superintendent's responsibility to bring the school board, the public, and
the staff to this philosophy. His remarks before the American Association of
School Administrators make plain the need for open reporting and independence
of the evaluation unit.

Candor in Reporting at the Local Level. The pressures affecting candor
in local reporting to federal agencies are peculiar. There is an inclination
to report honestly, but there are also reservations because of a fear that
funds will be cut if the report comes out unfavorable. On the other hand,
individuals recognize that funds are rarely if ever cut on the basis of
evidence, and such decisions are based on other factors, especially "politics."
Complicating this is the fact that targets or goals can easily be set so 
that progress appears to be made, eliminating the possibility of grant 
termination on grounds of failure, and yet assuring that funding on the 
basis of need will continue so as to permit achievement for particular goals. 



These issues of integrity and honesty also surface with regard to
outside contractors, when they are used. One reason to hire an individual
external to the district office is to ensure that information and results
will be obtained which are more free from bias resulting from internal
pressures and prejudices. Trust is placed in the objectivity and integrity 
of the contractor. However, at the same time, observations have been made 
as to the lack of research integrity of some hired contractors, either 
due to the district's naivete about selection, the lack of available 
qualified individuals, or, surprisingly, the nature of the contractual 
process itself. Contractors themselves have cited incidents of lost reports, 
district pressures (implicit or explicit), and purposive dilution of reports.
Thus, it must be noted that district or program bureaucrats themselves often 
play a role in reducing the integrity of outside evaluations. 

Some reports are undoubtedly withheld but the incidence of this from 
our site visits appears to be small. Unpleasant reports simply do not get 
wide circulation; more interestingly, they can result in harsh evaluations of 
the evaluator's competency in general. David's case studies of 15 school
districts generally confirm this for Title I. Positive standardized test
results generally serve to foster positive feelings toward the evaluator
and his/her capabilities. Negative results are ignored or "explained away
as inappropriate."

Constraints on Capabilities

The constraints on capabilities do not depend solely on individual skills,
but also on factors associated with the evaluation context. In this section,
we address those issues that inhibit both federally required evaluation efforts
and the conduct of special evaluation activities.
Inadequate Numbers of Evaluation Personnel. No clearcut criteria exist
for determining the number of individuals needed for adequately conducting
evaluation activities at the SEA and LEA levels. The size of the staff
is partly determined by the kinds of activities assigned and also by the
size, number, and nature of the specific federal programs. For example,
SEAs can be responsible for aggregating and reporting data to federal agencies
and providing technical assistance to their local districts in submitting the
proper information. While it may take only a few individuals to compile the
necessary statistics for federal agencies, providing evaluation assistance
to LEAs requires more personnel. Once additional activities are undertaken
(e.g., the conduct of special studies), the required numbers and capabilities
of SEA staff quickly multiply.

While federal evaluation reporting requirements have not substantially
diminished over the past decade, it is unclear whether the number of evaluation
professionals in SEAs has increased accordingly. Evaluation staff positions
are still heavily dependent upon federal funding and state allocation of these
administrative monies to professional positions. For example, in State VI
almost 60% of the evaluation unit's professionals were supported by federal
monies, and results from our phone surveys provide further evidence of this
dependence. It is unlikely that this situation will change, even in an
era of increased accountability. In fact, in State IV this atmosphere has
been interpreted to mean reduced government bureaucracy across all SEA
departments, and the evaluation unit has lost two of its five staff positions
as a result. It should be noted that this SEA serves over 1,000 districts,
similar to State VI, but has one-twelfth of the staff. Consequently, it is
not surprising that State IV does not engage in much technical assistance
to its LEAs while State VI has created its own technical assistance unit.

The same problems plague LEAs. It was frequently noted in our site
visits that the elimination of federal evaluation monies would result in the
demise of evaluation units and personnel, given school districts' de-
clining financial resources. Employing Webster and Holley's criteria for
evaluation unit staffing, two-thirds of the units in our site visit sample
were understaffed. Decline of units has also occurred; for example, in
Site J the evaluation staff has fallen from 22 to 2 full-time professionals
within the last 7-8 years. Complaints as to "too few staff" were also noted
by over 90% of the Directors of Research and Evaluation units in the CSE
survey.

Inadequate staffing levels can paradoxically lead to underutilization
of evaluation units in both LEAs and SEAs. For example, in our SEA phone
surveys, we found that evaluation units did not typically handle all evaluation
activities associated with federal programs. Vocational and Special Education
programs frequently used their own staff for evaluation reporting. The same
situation was true for LEAs in our site visit sample. Only in those dis-
tricts which were adequately staffed (by Webster and Holley's standards)
was the majority of evaluation activities associated with these programs
conducted by the unit.
The Expansion of Technical Assistance in Evaluation. Complying with
federal evaluation reporting requirements [...] not require highly sophis-
ticated evaluation capabilities [...] does require that technical
assistance in meeting these requirements be made available
[...] complexity. For example,
[...] Reporting System, the Technical
Assistance [...] training individuals responsible
for evaluation reporting in SEAs [...]. This has included such activities
[...] the [...] program participants, the use of tests,
[...] evaluation, and the development of computer soft-
ware. The provision of [...] was specifically targeted at evaluation
[...] district receiving Title I monies must
comply with these mandates [...] often do not have the trained personnel to do
so on their own.

[...] programs have not enjoyed this extensive assistance,
although [...] be required to provide these services to their respective
districts. For example, while one role of an SEA may be to provide technical
assistance in Title VII evaluation, not all SEAs are adequately staffed
[...] activity and thus have fewer resources upon which to
[...] counterparts. At the same time, districts have voiced
the need for aid in selecting competent outside contractors and answering other
[...] Title VII, Special and Vocational Education staff
[...] the need for technical assistance, were doubtful
[...] the adequate numbers of trained staff to provide this service
[...] developing evaluation designs and alternative
models.

[...] of Enhancing the Quality of Evaluation. The issues relating to
[...] previous paragraphs have primarily focused on
ensuring that capabilities exist for minimally complying with federal
evaluation reporting requirements. However, emphasis needs to be directed
at not only complying but also providing opportunities for improving evalu-
ation efforts, especially in districts with sophisticated evaluation personnel.

A good example of how opportunities could be promoted is analogous to the
[...] under Section 183 of Title I. These grants are currently
awarded to SEAs for the refinement of the Title I Evaluation Reporting System.
[...] possible for SEAs with competent staff to develop a
proposal and receive funds to examine such issues as quality control, method-
ology, and cost-effectiveness. The same opportunities could be offered to
competent LEAs across a variety of program areas.

At present, there exist few opportunities for district evaluators
competent and eager to conduct evaluation research to improve methods and
examine issues related to federal education programs. As Webster and Stuffle-
beam have indicated, federal funds have not been targeted at facilitating
LEAs to answer questions beyond those generated from required efforts. It
is not unrealistic to assume that LEAs with highly competent staff can
propose and conduct studies which can enhance general understanding of
educational evaluation.
In addition, there are few mechanisms to facilitate staff development in
evaluation for SEAs and LEAs. Title I Technical Assistance Center sponsored
workshops represent one available option, but these workshops do not cross-
cut other federal programs. Further, it is unclear whether one-to-three
day workshops in evaluation are capable of providing more than minimal
exposure to evaluation topics. Other mechanisms might include the pro-
vision of monies to promote university and LEA/SEA relationships or
the sponsorship of evaluation training fellowships. Concerning the first
alternative, evaluation staff could receive both training and advice in con-
ducting their assigned activities, and graduate students could obtain actual
experience in educational evaluation. Sponsoring a position where such in-
dividuals as researchers or professors could participate in evaluation activi-
ties could improve the quality of information and methods used in SEAs and
LEAs for evaluating federal programs.

Standards for Selection of Outside Contractors. Outside contractors
are hired for a variety of reasons: to enhance independence, to compensate
for limited LEA resources, and for quality assurance audits. In terms of
federal evaluation reporting requirements, they are typically hired by local
school districts to collect information required by Title I and Title VII
mandates. However, we encountered some local need for guidance in selecting
appropriate contractors. For example, in State II, where districts are
required to hire outside contractors for Title I programs, guidance is provided
by the SEA in alerting districts as to the types of tasks they should expect of
their contractors (e.g., classroom observations). They are also made aware of
their rights in the contractual arrangement and the problems which can result.
This procedure was devised by the SEA Title I evaluator and the TAC.

However, this assistance is not so commonly provided in Title VII pro-
grams. When there is an evaluation unit present, their staff often can
monitor the process and help to ensure that a competent individual is
selected. However, this is not always the case and many LEAs receive little
guidance as to what standards should be employed in hiring outside contractors.
School boards and superintendents often complicate the process by only looking
at the price tag for the evaluation rather than the skills of the bidder.
Given that these contracts are typically very small, this does not encourage
large reputable firms to participate in the bidding process. Districts should
be provided with standards and guidelines to assist them in the contractor
selection process and help prevent the possibility of obtaining poor quality
evaluation.

In our interviews, we encountered numerous instances where outside
contractors received a rather insubstantial amount of money to perform the
required evaluation.* Similarly small allocations were observed in many LEAs
for Title I evaluations. The major factor contributing to this situation is
the absence of any standards by which to judge whether sufficient resources
have been allocated. According to Freda Holley, the Austin School District
Superintendent's office has issued guidelines as to the percentage of program
costs which should be allocated to evaluation. The percentage set-asides vary
in accordance with the size of the award. They recommend a 10% allocation for
program awards under half a million dollars; a 7.5% allocation for a million



*See Chapter 4, Section 6.
dollar award and between 4.5% and 1.5% for awards over a million dollars.
It is impossible for us to judge whether these guidelines are sufficient:
time did not permit intensive investigation. But we believe this problem
is pervasive and serious enough to warrant further investigation into the
consideration of guidelines on contract size for various levels of evalua-



Footnotes

1. [...] the documents cited here and elsewhere in the chapter
are given in Section 8, References. The citations are given by author
named in the text. If no individuals are named, then the citation is to
[...] Society, or to an institution
[...] in the text [...]

[...] section are excerpted from a review of the literature by
Boruch and Wortman (1979). See the references in Section 8.
CHAPTER 6. HOW ARE THE RESULTS OF EVALUATION USED?

Robert Boruch, Laura Leviton,
David Cordray, and Georgine Pion
(Sections 6.1-6.7)

Laura Leviton and Robert F. Boruch
(Section 6.8)

Sensible people regard nothing as useless.

La Fontaine
Fables V, 19

Of course, every half-crazed accumulator
of refuse, who lives among old bottles,
stacked newspapers, and the like, regards
himself as eminently sensible.

Bergen Evans
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6. HOW ARE THE RESULTS OF EVALUATION USED? 



The argument about whether evaluations are used is justified. But
it is not always well-informed and it is often confusing. To illustrate,
we encountered a Congressional staffer who announced flatly at the beginning
[...] Committee. Five minutes
later he said that his committee used evaluations regularly in guiding the
direction of [...]. Similarly, at Site [...] we encountered a program
[...] to discover later that
he meant [...] administration," for they had indeed been used by
[...].

[...] believe the confusion, or at least inconsistency, implied here
[...] conflicts among managers, evaluators, and policy makers
regarding use of evaluation results. Part of the problem lies in agreement
[...] definition [...] evaluation results, so we present some functional
definitions in the [...] section. Part depends on the audience for the evalu-
ation. Information useful to some audiences is immaterial to others. This
too is considered briefly.

[...] argument over what information is used also lies in flawed
[...] response, and self-interested response. We obtained under-
estimates when, we believe, respondents relied solely on memory, and
overestimates when, we believe, they reckoned that declaring information use
was important. It is for this reason that we've enumerated probable sources
of bias [...] and have developed case histories based
on corroborative evidence to assure that we can assay use and nonuse in at
least a few clear instances.



6.1 DEFINITION OF USE AND AUDIENCES FOR EVALUATION RESULTS 

The absence of any uniform definition for "use of evaluation results"
[...] argument about whether they are indeed used. The
absence of definition certainly makes it difficult to verify claims of use,
and to decide how evaluation budgets should be set. A number of efforts to
assess the use of research and evaluation results have been undertaken over
the past four years. These efforts, supported by NSF and NIE among academic
[...] organization, have resulted in a clearer
picture of the use of evaluations. The three broad functional definitions
presented here are based on that work.1

Use of Information in Making Specific Decisions

[...] involve modifying program operations or regulations,
[...] or constructing specific policy. The NIE Compensatory
Education Study, for example, clearly influenced the form of amendments for
Title I programs and some program operations. The evaluation of Follow
Through did not lead clearly to major modifications of the program nor to
any major specifiable decision in the legislative arena, though findings
were used elsewhere. The use of evaluations to make specific decisions is
not a frequent event, but it is not uncommon either. The case studies
[...] instances we have been able
[...] specific decisions, such as use
by the Joint Dissemination Review Panel, are also considered later.
Use of Information to Enhance Understanding of Issues

This use encompasses understanding issues, providing context and
background for policy development, and influencing ideas and attitudes about
a program. For instance, the House Report on the Education Amendments of
1978 notes that prior to the NIE Compensatory Education Study, evaluations
had led us to believe that compensatory education was difficult to achieve
and indeed not succeeding terribly well. The Report noted that the pessi-
mistic view has been changed to one of greater optimism by the NIE report,
and the consequences of such optimism may be far reaching. This also implies,
of course, that earlier reports were also used in the simple sense of support-
ing a view that the program was not a splendid success in altering the in-
tellective achievement of children.

Use of Information to Persuade Others or to Confirm One's Beliefs 



The use of information to persuade others, to argue for program
changes and levels of program support, and other related uses of
evaluations are common. For example, former DHEW Secretary Califano cited
evaluation evidence in testimony supporting particular program changes in
Title I during reauthorization hearings. Almost all of the witnesses
representing nine states and testifying in March 1979 hearings of the House
Subcommittee on Elementary, Secondary, and Vocational Education cited posi-
tive outcomes of local and state evaluations to argue for funding for Title I
programs. This use of evidence includes supporting or confirming one's own
beliefs.

Using information for rhetorical purposes is legitimate relative to
some standards. It is clearly not legitimate relative to others. Re-
sults of badly designed evaluations, for instance, have been used to argue
for reduced budgets and for increased budgets, for modifying regulations,
and for keeping evaluations as they stand. Well-designed evaluations may
lead to less equivocal conclusions, but the recommendations drawn may have
little or nothing to do with the data.

Different Strokes for Different Folks

Audiences for evaluation results have been considered in Chapter 2.
To summarize here, they include policy-makers, managers, and oversight
agencies at national, state and local levels. At the local level, the
audiences can also include parents, parent advisory groups, and teachers.
Any particular audience contains individuals who are indifferent and others
who attend to results. There is considerable variation across programs,
across school districts, and across states. Focusing on particular
audiences during the evaluation planning process is critical simply because
the information made available to one audience may be perfectly useless to
another. Finally, it can take a good deal of time to decide what information
is most useful to which audience, if we may judge from the 18 months required
to set up evaluation and reporting plans by the Bureau of Education for the
Handicapped, and over six months planning time required by the NIE Com-
pensatory Education Study.
[...] consider Henry Brickell's examination
of [...] information demanded by NIE, OE, ASPE, OMB, GAO and others
of an NIE-supported evaluation of a career education project. NIE and OE
wanted answers to a variety of questions about program execution and about
[...] on children. BOAE, the management group within OE
[...] consumer of the evaluation, was most interested
in three questions: What can BOAE learn from this evaluation about running
[...] programs? Is the evaluated program transportable to other
[...]? What can this evaluation tell us about how to implement the program?
[...] evaluation reflected
whether NIE was doing work in the areas in which Congress and the public were
interested and whether the focus of NIE should change. OMB was not interested
in the evaluation report per se, but in the rationality of NIE's program plan.
GAO was interested in any information needed for audits, but especially effect-
iveness information. State legislators interviewed about their information
needs requested every single kind of report available.

[...] of evaluated programs and the six local education
agencies that had adopted the program had other concerns. The managers
[...] the operating program and how to inform the
public about the project. Program specialists concentrated on reports dealing
with their area of specialization, in efforts to improve the curriculum.

[...] members in districts that might adopt the program wanted
to know how the project was introduced into the six LEAs that had adopted
the program originally. This information would assist them in understanding
[...] operating the program and the degree to which it would be
acceptable to the community. Superintendents also wanted information on the
introduction of the program into new LEAs, but in addition, reports on de-
velopment of new tests for the program and a new teacher manual were also
in demand. Professional associations were interested in parent, pupil and
[...] program, and in the field tests of curriculum
materials.

UCLA's Alkin, Daillak, and White's intensive case studies of five
local education agencies reach analogous conclusions. Considering the use
[...] program, a Title I program, and
others, Alkin et al. reiterate the importance of the different audiences.
[...] evaluations quite differently from school dis-
tricts and frequently local uses had little to do with the reporting require-
ments of the states. Many of the uses made of evaluation information were
determined jointly by the content of the evaluation, the situation, other
[...] characteristics. In no instance could evaluation be
[...] single datum on which a decision was made. However, the case
studies do provide evidence that evaluation played a detectable role in
changing thinking and in making decisions.

More generally, an analysis of UCLA data on over 200 research and
evaluation directors in school districts, conducted by Resnick, O'Reilly
and Majchrzak at UCLA's Center for the Study of Evaluation, helps to confirm
these results and give a more general picture. Evaluation directors said
there is greater use of evaluations by superintendents and school boards
when the evaluation gave information about allocation of resources. Program
directors, on the other hand, were viewed more often as using evaluations
when they dealt with curriculum selection and modification. Superintendents
and teachers also tended to use evaluations more for this latter purpose.

6.2 SOURCES OF DATA ON THE USES OF EVALUATION RESULTS

Since 1977, several major studies of use at local, state, and federal
levels of government have been undertaken. For the local level, these
include SRI International's Study of Local Uses of Title I Evaluation and
their Evaluation of the National Diffusion Network, the UCLA Center for the
Study of Evaluation's survey of school district evaluation units and Alkin's
case studies for school district uses of evaluation, Rand Corporation's Study
of Federal Programs Supporting Educational Change and Datta's critique of
Rand's report, and the Huron Institute's current Study of Local Uses of
Evaluation. Local school district offices of evaluation, such as Austin's,
have undertaken smaller studies, which have been reported in the professional
journals, and they are no less useful.2

Investigations of activity at the state level are less frequent. They
include SRI International's study of the National Diffusion Network, the
Mitchell study of utilization by state legislators, Joan Bissell's work on
uses by the California state legislature, and the Hope Associates study of
Title I Technical Assistance Centers.3

For the federal level, we rely on the Office of Education's Annual
Evaluation Report and related reports, Hearings on Costs, Benefits, and
Utilization of Evaluation by the U.S. Senate Committee on Human Resources, and
pertinent reports of other committees. We also use results of surveys of Con-
gressional staff members conducted by David Florio, Harrison Fox, and Hillel
Weinberg, the last two being current and former Congressional aides. In
addition, we include information from selected case studies developed by NIE
staff members, Datta's study of Headstart, and Millsap's case study of use of
evaluations in regulation writing. Finally, we rely on our case studies and
surveys. Details are given in Appendix 3 and Footnote 4 of this chapter.

6.3 USE OF EVALUATION RESULTS AT THE NATIONAL LEVEL

The Annual Evaluation Report

The Office of Evaluation and Dissemination has issued a formal Annual
Evaluation Report on programs administered by the U.S. Office of Education
since 1971. Information on uses of evaluation reports has been routinely
reported since 1974. Early reports confined attention to uses of evaluation
"studies," while the most recent cover "evaluation activities," including the
generation and distribution of manuals for local school district use. There
is notable overlap in the studies cited from year to year, with new ones
being added as they are used and old ones eliminated as a "use" becomes
obsolete. The latter includes, for instance, use of studies on alternative
formulas for Title I formula allocation for Public Law 93-380 in 1974-75.

In the Annual Evaluation Report for 1979, some 42 evaluation activities
are enumerated in the section on use of evaluation products:

• 6 items refer to production of technical manuals
• 3 items refer to production of data tapes
• 33 refer to reports on evaluations.

The presentation is summarized in the exhibit attached. The index of use
for manuals is distribution and sales. So, for instance, some 14,000 Handbooks
for guiding LEA efforts to evaluate the impact of their programs have been
sold. The index of use for tapes is distribution. For instance, a survey
of state and local use of education funds, required by law and conducted
through OED, is said to have been distributed to a variety of government
agencies including the Congressional Research Service, universities, research
institutes, and the National Education Association.

The 33 evaluation reports mentioned in the Annual Evaluation Report
can be classified only very roughly into categories to obtain some feel for
the products:

     Exploratory, planning, and needs assessment      10
     Process, implementation, compliance              24
     Estimating effect of programs on clients          6

The total does not add to 33 since some studies, of bilingual education
for instance, have multiple objectives.

Of the 24 evaluations bearing on implementation of programs, almost
all are said to have been used in management decisions of one kind or
another. So, for instance, evaluation of state plans for career education
programs led to half the states revising their plans. Reports on higher
education were reported to have been used in developing budgets. Regulations
were changed at least partly because of evaluations of the Emergency
School Assistance Act, Title III, Desegregation Assistance under Title IV,
accreditation practices, and Exemplary Vocational Education Programs. Some
led to changes in internal management procedures, e.g., evaluations of Title I
migrant education record systems, of earlier evaluations of state programs
under Title I, of operations under the Emergency School Assistance Act. No
more than a half dozen of the implementation studies appear to have been used
in amending law or in authorization decisions: Emergency School Assistance
Act, Sustaining Effects Study, accreditation, ESEA Title IV, Compensatory
Reading, and Vocational Education for the Handicapped.

Of the half dozen or so evaluations which address the question of
what the effects of services are, the uses are mixed. The bilingual study
resulted in changes in law, policy, and regulation and has had some effect
on appropriations. Review of evaluations of career education programs
resulted in management decisions to approve some programs for dissemination.
Evaluation of Follow Through was cited specifically in hearings, and some
Exhibit for Chapter 6

FY 1979 Annual Evaluation Report
Uses of Evaluation Activities

Topic or Area
     Title I Models
     Title I
     Title VII Bilingual
     Annual data
     Title III Special Projects
     Career Education Search
     Career Education
     Career Education
     Methods

Product
     Evaluation Models
     Data
     Report (Effect)
     Bilingual Materials Development
     Tapes
     Identification of 127 Effective Projects
     Exemplars: 7 out of 257 with evidence
     Local need for evaluation (Effect)
     Evaluation Handbook
     "Evaluations" of State Plans for Career Ed
     • Handbook: measuring project impact
     • Guide to validating gains
     • Sampling procedures
     • Study of state of materials
     • Inventory of materials
     • Proposed regulations

Evidence on Use
     • Percent of school districts using models
     • Public testimony
     • Modification in law
     • Internal audit and tracking created
     • Management change
     • Report (HR 7555) citation
     • Regulations
     • Users, such as Rand, UC, NIA
     • 2,185 adoptions during 1978-79
     • Rhetorical emphasis on implementation
     • JDRP/NDN management
     Commercial publication
     Half of states revised plans
     14,000 sold
     12,000 sold
     No information
     Proposed regulations
Topic or Area
     12. Higher Education Title III
     13. National Longitud
         Study of the High
         School Class of 19
     14. Consumer Protection
     15. Higher Education
     16. Higher Education
     17. Higher Education
     18. Emergency School Assistance Act
     19. Emergency School Assistance Act
     20. Desegregation
     21. Desegregation Assistance Title IV
     22. ISM
     23. Follow Through
     24. Basic Skills

Product
     ...port on institutions
     Data analysis
     ...book: Student's ...umer guide to ... and Occupa...
     ...l Report: ...ve approach ...ribution of aid ...lents
     Study of Nonprofit Organizations (NPOs)
     • Study on need and [illegible]
     • Handbook for Integrated Schools
     Study
     Sustaining Effects Study

Evidence on Use
     Changes to regulations, but unspecified
     Use in Bakke case
     "Commercially distributed"
     No evidence on use of distribution
     Presented in testimony by Commissioner
     • "Used by staff in developing budget"
     • "Used by congressional staff in reauthorization"
     • Specific changes in draft regulation
     • Specific changes in law
     • Specific management changes
     70,000 copies "distributed"
     Substantial revision of regulations
     Used in reauthorization (HR 15, Hearings)
     • Citation in Report 95-1151 on (HR 7577)
     • JDRP Review of 21 projects
     • Feedback on evaluation design unspecified
Topic or Area                                Product
     25. Budget                              Budget Projection Models
     26. Loans                               Survey of lenders
     27. Accreditation                       Study
     28. College Finances                    Study
     29. ESEA, Title IV                      Study
     30. Educational Change                  Study
     31. ESEA Title I Neglected Children     Study
     32. ESEA Title IV                       Study
     33. Compensatory Reading                Study
     34. Upward Bound                        Study
     35. Planned Variations                  Report
     36. TV                                  Report
     37. Voc Ed/Handicapped                  Report
     38. Voc Ed/Disadvantaged                Report
     39. Exemplary Voc/Ed Programs
     40. Community Right to Read             Report
     41. Title I Migrant                     Report
     42. Reanalysis of ESEA Title I Reports  Report

Evidence on Use
     • No specific evidence on uses cited
     No specific information
     • FTC regulations change
     • 1976 law change
     • Carnegie Commission Report
     • Use in policy paper
     • Nonspecific influence on regs
     • Nonspecific Congressional action
     • "Report shared"
     Specific citation in Hearings
     Specific citation in Hearings
     Regulations revised
     Letters attesting to usefulness
     Book Award
     1976 Amendments change
     "Made available"
     Changes in regulations
     "Significant contribution to establishing guidelines for
     Reading Academies"
     Conversion for fund allocation
     Executive document (unspecified)
management decisions were made in the sense that some Follow Through models
were reviewed as to effectiveness and approved for dissemination by JDRP.
Results of Title I assessments were used theoretically, to argue for
continuation or expansion of the program. Evaluation of Upward Bound programs
resulted in some changes in regulations.

This summary and the information provided in the Annual Report are a
bit too terse to do justice to the topic of management uses of information.
For instance, the evaluation of the Experience Based Career Education Program
(ECBE), developed by NIE, was clearly used in the formulation of regulations
for exemplary projects under Part D of the Vocational Education Act. The
evaluation suggested that this experimental program was notably successful
in achieving desirable goals. As a result, the regulations gave priority
for funding under Part D to replications of the ECBE program. Other
proposals for projects under Part D would have to show that they were at least
as effective as ECBE, through evaluation evidence presented to the Joint
Dissemination Review Panel. Sixty percent of subsequent grants under Part D
were replications of ECBE, while another 25% combined ECBE with other programs.
OE has used evaluations to improve its National Diffusion Network. For example,
a study of the Project Information Packages showed that, although PIPs
brought about improvements, personal assistance was also necessary to
implement innovations in LEAs. As a result, developers were funded both to
develop materials and to provide personal assistance in implementing
innovations. Several evaluations (RAND change agent study, evaluation of
PIP dissemination and implementation) indicated the need for assistance to
LEAs in tailoring the innovation to their specific needs, and developers
were funded to do so. 5

We have no evidence that the catalog of uses of products enumerated in the
Annual Evaluation Reports is untrustworthy. The information presented,
however, is often not sufficiently precise to permit an outsider to verify
it. To be sure, there are explicit references in the 1979 Report to specific
hearings and to law in some cases, five or so in discussion of Title VII
bilingual education. But there are about 17 instances in which citation is
incomplete: "FTC regulations...developed partly on the basis of findings,"
"proposed regulations," and so on. Apart from this, the actual title of the
evaluation report which is said to have been used fails to appear in a dozen
or so cases. Congressional reports are often similarly incomplete in citation.
It is on account of imperfections in citation, and in the interest of
verifying OED contentions, that we undertook the case studies discussed later.

The OED list is also a bit confusing in the simple sense that evaluations,
technical manuals, and production of data tapes are all combined. The catalog
is uninformative about periodicity in use of information. Because it is
cumulative and because citation is imperfect, any given entry may refer to an
evaluation completed anytime during 1974-1979. This makes the report of limited
usefulness as an index of productivity of the evaluation unit.

The catalog does not enumerate studies that have had little use to policy
makers, managers, and oversight agencies, and that have evoked little
professional interest. It is not unreasonable to do so, in the interest of
accounting for evaluation monies. The task need not be left to the agency.
Congressional Staff Views

The views of Congressional staffers with responsibilities pertinent to
evaluation are mixed but tend to be positive. So, for instance, David Florio
conducted interviews with 26 staff members concerned with educational issues.
The study found that the focus of these staff members was less on general
knowledge than on the specific policy issue at hand. General knowledge about
research and evaluation did filter up to these users from other sources. These
sources, ranked in order of Congressional reliance on them, include:

     GAO, CRS, and other Congressional agencies (37% of staff mentioned them)

     Federal agencies (29%)

     Professional associations (12%)

     All other sources (for example, home based) (22%)

Washington-based sources of information predominated (80%).

Evaluations must compete with other sources of information Congress
receives. In the Florio survey, about a third of staff members felt that
research was consistently important as a source of information. Another
third felt that the importance of research information depended on the issue
at hand. Most respondents felt different types of research information were
useful for their purposes at different points in the legislative cycle. A
majority of staff members believed research was useful for development of
issues, deliberation, and oversight. However, the group was split almost
evenly as to whether they felt research was useful for decision-making itself.
This may reflect the Congressional view of what the decision-point is, namely,
a vote and compromise. We know from studies of legislators that as votes
approach, positions of legislators harden, such that research information
becomes less influential. 6



An earlier survey by Harrison Fox, a Congressional staffer for former
Senator Brock, supports the finding that evaluations are a valued source
of information for oversight, at least in principle. Senate staff
participating in the survey believed that evaluations were effective for this
purpose, ranking only behind hearings and meetings, staff communication with
agency personnel, staff investigations, and audits of agency programs.

In the Florio survey, different types of research information did
not have equal importance to Congressional audiences. Cost of the program
to the government was most important, and students' achievement ranked
second. A variety of other information followed in the rankings, but
note that costs and achievement scores represent goals on which everyone
agrees, for almost all educational programs. The importance of other
information undoubtedly depends on the issue at hand.
Congressional Use of Evaluations: Committee Reports

Federal agency staff members often complained about the extent to
which evaluations were not, or could be, used by the Congress. We could not
investigate this problem intensively in the time available. However, we
did review various committee reports to understand the extent to which
evaluations were mentioned, and the extent to which a reference to evaluation
carried sufficient information for the bureaucrat or outsider to determine
what evaluation was useful. The following remarks summarize some of that
work. Case studies which trace use of evaluations in Compensatory Education,
Bilingual Education, Innovative Programs, and others are given later.

The contents of the Report of the House Committee on Education and Labor
on the Educational Amendments of 1978 reveal nearly 30 references to the NIE
Study, and one contractor is identified. There are 5 references to GAO
assessments and 4 references to OE evaluations. Two OED contractors were cited
without reference to OE. The Report of the Senate Human Resources Committee
partly duplicates the language of the House report. In both Reports,
criticism of OE was reiterated. The main basis for criticism was the GAO's
report on OE evaluations.

The 1979 Report of the Senate Committee on Appropriations provides a
terse rationale for each of the 85 or so items on which judgments are made.
There are 11 references to evaluation, including 3 GAO reports. There are two
verifiable references to OED-supported studies. None of the references is
specific in the sense of specifying a title of the study or document. The
conscientious inquirer can presumably go to the Hearings.

The 1980 Report of the House Committee on Education and Labor reviewed
elements of each of thirteen Titles. There are six references to evaluations,
including one each produced by OE, GAO, and the IG. The only specific
reference is given to the evaluation of the Fund for the Improvement of
Postsecondary Education by Sol Pelavin. The office which sponsored the study,
ASPE, is not acknowledged.

Reports of the House Committee on Appropriations for 1977, 1978, 1979
cite evaluations on nine occasions. Two of these are verifiable references to
OE studies and one to GAO.

From this small search, we conclude from the more conscientiously
constructed reports that: (a) On average, evaluation is mentioned and
presumably used in 1 out of 8 or 9 cases in which judgments about individual
budget items are made. In the remaining 7 or so cases, there has been no
evaluation at all, or there has been a useless evaluation, or a useful
evaluation has not been cited or has been ignored, or no one thought to
mention it. (b) The citations to studies are often vague and, at least at
times, studies which appear to have been used are not acknowledged.
(c) Citation rate of OED-mounted studies is not much different from citation
rate of GAO studies in recent years.
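
The "1 out of 8 or 9" figure in (a) can be checked roughly against the counts
given above for the 1979 Senate Appropriations report (11 references to
evaluation across 85 or so items). The sketch below is only an illustration of
that arithmetic: the function name is hypothetical, the item count is
approximate in the text itself, and the calculation assumes at most one
reference per budget item, which the report does not state.

```python
def one_in_n(references: int, items: int) -> float:
    """Return n such that evaluation is cited in roughly 1 out of n items.

    Assumes each reference falls on a distinct budget item (an assumption,
    not something the committee report confirms).
    """
    if references <= 0:
        raise ValueError("need at least one reference")
    return items / references


# Counts quoted in the text for the 1979 Senate Appropriations report.
senate_1979 = one_in_n(references=11, items=85)
print(f"evaluation cited in about 1 of every {senate_1979:.1f} items")
```

Because the item count is stated only as "85 or so," the ratio should be read
as an order-of-magnitude check on the conclusion rather than a precise rate.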
Exhibit for Chapter 6

References to Evaluation in: U.S. Senate, Committee on Human
Resources. Report: The Educational Amendments of 1978, 95th Congress,
2nd Session, Report 95-856, [month illegible] 15, 1978.

Elementary and Secondary Education: Title I

A. Grants for the Educationally Deprived: 11 major references to
   the NIE Compensatory Education Study, one to an unspecified
   OE Study (probably Sustaining Effects), one to a GAO study.

B. Programs operated by state agencies

   1. Handicapped: No reference
   2. Neglected and delinquent: Nonspecific reference to
      an evaluation (actually OED)
   3. State migrant program: Nonspecific reference to DHEW
      information
   4. Payments for state programs: No reference

C. State Administration: 3 major references to NIE Study

D. Federal Administration:

   1. Evaluation: Nonspecific reference to GAO review of
      OE annual evaluation report
   2. Complaint resolution: No reference
   3. Audits: Reference to the Assistant Secretary DHEW
      Sanction Study
   4. Withholdings: No reference
   5. Policy Manual: NIE Study referenced

E. General Provisions

   1. Basic Skills: Reference to NAEP
   2. Metric Education: No reference
   3. Arts in Education: No reference
   4. Consumer Education: No reference
   5. Youth Employment: No reference
   6. Law-related Education: No reference

Note: In this and succeeding exhibits, "No reference" means no mention is
made of any evaluation. An evaluation may or may not have been done in
each case.
   7. Environmental Education: No reference
   8. Health Education:
   9. Correction Education:
   10. Biomedical Sciences: Nonspecific reference to NIH,
       AMA studies
   11. Population Education: No reference

Libraries, Learning Resources, Educational Innovation (Title IV): Nonspecific
   reference to Rand Study of Innovation

State Leadership (Title V)

   1. Application:
   2. Rulemaking:
   3. Technical Assistance: Reference to DHEW Sanction Study
   4. State Monitoring: Reference to GAO report,
      complaint and resolution
   5. Withholding: NIE study cited
   6. Audits: DHEW Sanctions Study, NIE Study

Emergency School Assistance Act (Title VI): GAO report cited

Bilingual Education (Title VII): Absent
   OE information mentioned; nonspecific reference to GAO report

Ethnic Heritage Studies (Title IX): No reference

Community Schools (Title X): Specific reference to OE evaluation

Women's Educational Equity (Title XI): No reference

Non-public Educational Assistance: Specific reference to NIE study

Impact Aid (Title II)
   Thirteen items: No reference

Extension (Title III)

   1. Adult Education: No reference, but fascinating
      passage on emigrant organizations as educators.
   2. Indian Education: No reference
   3. Teacher Training: No reference

General Education Provisions: Paperwork Commission cited.
Exhibit for Chapter 6

References to Evaluation in: U.S. Senate, Committee on Appropriations.
Report: Departments of Labor and Health, Education and Welfare
Appropriations Bill, 1980, H.R. 1489, 96th Congress, First Session (Report
96-247), July 13, 1979.

Elementary and Secondary Education

 1 Grants for the Disadvantaged: No reference to any evaluation,
   despite Senate use of the NIE Compensatory Education Study.
 2 Grants for the Disadvantaged - LEAs: No reference to any
   evaluation, despite Senate use of NIE.
 3 Grants for the Disadvantaged - SEAs: No reference.
 4 Evaluation and Studies: No reference.
 5 Concentration Grants: No reference.
 6 State Incentive Grants: No reference.
 7 Support and Innovation: No reference.
 8 Bilingual Education: No reference to the bilingual evaluation
   despite verified use, but the statements depend on the AIR
   evaluation.
 9 Basic Skills: Nonspecific reference to evaluation.
10 Achievement Testing Assistance: No reference.
11 Follow Through: No reference.
12 Alcohol and Drug Abuse Education: Reference to lack of
   evaluative information.
13 Environmental Education: No reference.
14 Educational Broadcast Facilities: No reference.
15 Ellender Fellowships: No reference.
16 Ethnic Heritage: No reference.
17 General Assistance to Virgin Islands: No reference.

Impact Aid

18 Formula Construction: No reference to evaluation, despite
   discussion of analyses by ASPE.

Emergency School Aid

19 General Grants: A GAO study is mentioned.
20 Special Programs and Projects: No reference.
21 Magnet Schools: There is a nonspecific reference to the Magnet
   School evaluation.
22 Grants to Non Profit Organizations: There is a nonspecific
   reference to a DHEW evaluation.
23 Educational Television and Radio: No reference.
24 Evaluation: No reference.
25 Civil Rights Training and Advisory Services: No reference.
Education for the Handicapped

26 State Grants: No reference.
27 Pre-school Incentive Grants: No reference.
28 Deaf Blind Centers: No reference.
29 Severely Handicapped Projects: No reference.
30 Early Childhood: No reference.
31 Regional Vocational Education: No reference.
32 Innovation and Development: Evaluation alluded to.
33 Media Services: No reference.
34 Special Education Manpower Development: No reference.
35 Special Studies: No reference.

Occupational, Vocational, and Adult Education

36 State Grants: No reference.
37 Program Improvement: No reference.
38 Special Programs: No reference.
39 Consumer and Homemaker: No reference.
40 State Advisory Councils: No reference.
41 State Planning: No reference.
42 Programs of National Significance: No reference.
43 Bilingual Vocational Training: No reference.
44 CETA and Vocational Education: No reference.
45 Adult Education: No reference.
46 Basic Educational Opportunity Grants: No reference.
47 Supplemental: No reference.
48 College Work-Study: No reference.
49 Direct Loans: No reference.
50 State Student Incentive Grants: No reference.

Higher and Continuing Education

51 Special Programs for Disadvantaged (TRIO): No reference.
52 Veteran's Cost of Instruction: No reference.
53 Educational Information Centers: No reference.
54 Developing Institutions: GAO evaluation mentioned.
55 Cooperative Education: No reference.
56 International Education: No reference.
57 University Community Services: No reference.
58 State Post-secondary Education Commissions: No reference.
59 Graduate Professional Opportunities: No reference.
60 Legal Training for the Disadvantaged: No reference.
61 Public Service Fellowships: No reference.
62 Mining Fellowships: Evaluation alluded to.
63 Law School Clinical Experience: No reference.
64 Construction Industry Subsidy: No reference.
65 Architectural Barrier Removal: Evaluative information requested.
Special Projects and Training

66 Youth Employment: No reference.
67 Biomedical Sciences: No reference.
68 Arts in Education: No reference.
69 Metric Education: No reference.
70 Consumer Education: No reference.
71 Gifted and Talented Children: No reference.
72 National Diffusion Network: Nonspecific reference to evaluation.
73 Educational Television Programming: Nonspecific reference to
   evaluation in children's television.
74 Push for Excellence: No reference.
75 Career Education Demonstrations: No reference.
76 Law Related Education: No reference.
77 Women's Educational Equity: No reference.
78 Community Schools: Nonspecific allusion to evaluation.
79 Cities in Schools: No reference.
80 Career Education Incentives: No reference.
81 Gifted and Talented Children: No reference.
82 Teacher Corps: No reference.
83 Teacher Centers: No reference.
84 Planning and Evaluation: No reference.
Exhibit for Chapter 6

References to Evaluation in: U.S. House of Representatives, Committee on
Education and Labor. Report: The Educational Amendments of 1980,
96th Congress, First Session, Report 96-520, October 17, 1979.

Title I: Outreach
     No reference.

Title II: College and Research Library Assistance and Library Training
     and Research
     No reference.

Title III: Developing Institutions
     Nonspecific reference to formula development
     (OED not mentioned)

Title IV: Student Assistance

     1. BEOG: No reference
     2. SEOG: No reference
     3. SSIG: No reference
     4. Special Programs for Students from Disadvantaged Backgrounds
        (TRIO): No reference
     5. Education Information: No reference
     6. Assistance to Institutions of Higher Education: No reference
     7. Low Interest Loans: Nonspecific references to IG report,
        OE evaluation
     8. Work Study Programs: Smith cited.
     9. National Direct Student Loans: No reference.
     10. General Provisions: Reference to GAO report.
         General lack of information cited.

Title V: Teacher Corps and Teacher Training
     No reference.

Title VI: International and Foreign Language
     No reference

Title VII: Construction
     No reference

Title VIII: Cooperative Education
     Nonspecific reference to "a study."

Title IX: Graduate Programs
     No reference

Title X: Fund for the Improvement of Postsecondary Education
     Reference to NTS Evaluation, no mention of ASPE
Title XI: Urban Grant University Program
     No reference

Title XII: General Provisions
     No reference

Title XIII: Community College of American Samoa
     No reference
Exhibit for Chapter 6

References to Evaluation in the House Reports on the 1979, 1978, and
[illegible] Committee on Appropriations (Reports 96-244, 95-1248,
and 95-381)

Elementary and Secondary Education

 1 Grants for the Disadvantaged: No reference to any evaluation
   despite House use of the NIE Compensatory Education Study.
 2 Grants for the Disadvantaged - LEAs: No reference to any
   evaluation despite House use of NIE.
 3 Grants for the Disadvantaged - SEAs: No reference.
 4 Evaluation and Studies: No reference.
 5 Concentration Grants: No reference.
 6 State Incentive Grants: No reference.
 7 Support and Innovation: No reference.
 8 Bilingual Education: Reference to "recent evaluation study" in
   encouraging action on deficiencies in the program (1980)
 9 Basic Skills: No reference.
10 Achievement Testing Assistance: No reference.
11 Follow Through: No reference.
12 Alcohol and Drug Abuse Education: No reference.
13 Environmental Education: No reference.
14 Educational Broadcast Facilities: No reference.
15 Ellender Fellowships: No reference.
16 Ethnic Heritage: No reference.
17 General Assistance to Virgin Islands: No reference.

Impact Aid

18 Formula Construction: No reference.

Emergency School Aid

19 General Grants:
20 Special Programs and Projects: Reference to a "recent study" that
   reexamination of the "follow-the-child" concept is necessary (1978)
21 Magnet Schools: No reference.
22 Grants to Non Profit Organizations: No reference.
23 Educational Television and Radio: No reference.
24 Evaluation: No reference.
25 Civil Rights Training and Advisory Services: No reference.
Education for the Handicapped

26 State Grants: No reference.
27 Pre-school Incentive Grants: No reference.
28 Deaf Blind Centers: No reference.
29 Severely Handicapped Projects: No reference.
30 Early Childhood: No reference.
31 Regional Vocational Education: No reference.
32 Innovation and Development: No reference.
33 Media Services: No reference.
34 Special Education Manpower Development: No reference.
35 Special Studies: Reference to BEH study showing 1/3 of hearing
   aids worn by public school children are malfunctioning (1977)

Occupational, Vocational, and Adult Education

36 State Grants: No reference.
37 Program Improvement: No reference.
38 Special Programs: No reference.
39 Consumer and Homemaker: No reference.
40 State Advisory Councils: No reference.
41 State Planning: No reference.
42 Programs of National Significance: No reference.
43 Bilingual Vocational Training: Reference to "recent studies and
   reports" showing lower employment rates of limited English-
   speaking population (1979)
44 CETA and Vocational Education: No reference.
45 Adult Education: Reference to 1975 study of prevalence of
   functional illiteracy (1977)
46 Basic Educational Opportunity Grants: Reference to computer
   audits finding many ineligibles receiving grants (1979)
47 Supplemental: No reference.
48 College Work-Study: No reference.
49 Direct Loans: No reference.
50 State Student Incentive Grants: No reference.

Higher and Continuing Education

51 Special Programs for Disadvantaged (TRIO): No reference.
52 Veteran's Cost of Instruction: No reference.
53 Educational Information Centers: No reference.
54 Developing Institutions: GAO evaluation mentioned.
55 Cooperative Education: No reference.
56 International Education: No reference.
57 University Community Services: No reference.
58 State Post-secondary Education Commissions: No reference.
59 Graduate Professional Opportunities: No reference.
60 Legal Training for the Disadvantaged: No reference.
61 Public Service Fellowships: No reference.
62 Mining Fellowships: Evaluation alluded to.
63 Law School Clinical Experience: No reference.
64 Construction Industry Subsidy: No reference.
65 Architectural Barrier Removal: Reference to HEW data on cost impact on
   institutions if barriers are removed. In light of demonstrated need,
   funding by committee enhanced (1979)
Special Projects and Training

66 Youth Employment: No reference.
67 Biomedical Sciences: No reference.
68 Arts in Education: No reference.
69 Metric Education: No reference.
70 Consumer Education: No reference.
71 Gifted and Talented Children: No reference.
72 National Diffusion Network: No reference.
73 Educational Television Programming: No reference.
74 Push for Excellence: No reference.
75 Career Education Demonstrations: No reference.
76 Law Related Education: No reference.
77 Women's Educational Equity: No reference.
78 Community Schools: No reference.
79 Cities in Schools: No reference.
80 Career Education Incentives: No reference.
81 Gifted and Talented Children: No reference.
82 Teacher Corps: No reference.
83 Teacher Centers: No reference.
84 Planning and Evaluation: No reference.
6.4 STATE USES OF EVALUATIONS

Very little systematic information is available on the uses to which
states put evaluative information. No surveys and very few case studies
have been conducted. The material summarized here is merely illustrative.

Joan Bissell of California's Office of the Auditor General has documented
use of evaluations by the California State Legislature. Although the links
between the evaluation and the actions and deliberations of the legislature
are difficult to verify, in most instances she presents plausible linkages,
her own experience of the path the deliberations follow, and respondents'
information about these linkages. She offers four examples. The state's
School Improvement Program developed a rating system for quality of school
programs, which showed little relation to pupil achievement in evaluation.
This finding led to changes in the statutes governing allocation of
resources to programs. In another evaluation, little relation was found
between level of funding for compensatory programs and achievement of pupils.
State policies on compensatory education were revised consistent with these
findings. In a third case, a state program for Mentally Gifted Minors (MGM)
was shown to exclude some high achievers on the basis of IQ and to have
excessively high administrative costs. Deliberations were undertaken by the
legislature to determine the future of the program in light of the Jarvis
Proposition 13 amendment. Finally, Bissell offers an evaluation in which
public schools' procedures for contracting with private vocational training
were found to lead to excessive costs. Hearings are scheduled to determine
alternate forms of financing such training.

Mitchell conducted an interview study of state legislators' use of 
social science. He found that legislators use social research when the 
information is brought to their attention early in the legislative process. 
When this occurs, research is used as background orienting legislators to 
the questions, and in the process of forming coalitions. As debate becomes 
more partisan and positions better defined, social research is used as 
political ammunition, and ceases to have much influence with those who are
not already convinced. Mitchell found that social science expertise can be
highly valued by state legislators, but is only one form of valued expertise
among many.

Federal Requirements that Use be Specified

The 1976 amendments to the Vocational Education Act (20 U.S.C. 2312)
require that each state evaluate the effectiveness of programs every five
years. The "results of these evaluations shall be used to revise the states'
programs and shall be made available to the State Advisory Council." A
separate amendment (20 U.S.C. 2308) requires that an annual accountability
report "contain a summary of the evaluations... and a description of how the
information from these evaluations has been or is being used by the State board
to improve its programs." The three accountability reports we reviewed happened
to be readily available. None is especially informative, and one is dreadful.
The Annual Program Plan for 1981 and Accountability Report for 1979
for the State of [...]:

"The Agricultural and Home Economics units... used the data to
plan teacher conferences... The program reviews have been most
valuable... for necessitating state staff persons and local
school persons to participate in a joint discussion of the
programs. (H)opefully, program improvement will result from
these reviews should have improved their competencies and
prepared them for future program reviews as well as make them
better supervisors" (sic, pp. 41-42).

Other material is of the same sort. No specific uses are given. Good grief.

The State of [...] Accountability Report for FY 1978 is rather
general. The document says that programs "use the composite report in
identifying priorities for... research, curriculum development, personnel
development, student services, elimination of sex bias, and funding."
Other references are to the report as a resource document for universities
and an assortment of other agencies. No more specific information is furnished.

The third accountability report says that "as a result of the evaluations,
the state agency has made the following commitments... sponsor
regional workshops... and exemplary programs in guidance... working with
local vocational directors at quarterly meetings to assist in local
placement... awarded contracts for competency based materials in six subject
areas...." There is no attempt to link any of these actions to a specific
evaluation finding.

We have been able to locate only two formal studies of the actual uses
to which evaluative program reviews, required by the Vocational Education
Amendments, are put. They were conducted by Charles Manning for the Napa
County, California, School District in 1975 and 1976. Each involved asking
district administrators what changed as a result of the program. The majority
of tangible changes which were said to have come about as a result of the
report involved additions or restrictions of resources and changes in
administrative procedures. The most remarkable one was recognizing the need
to hire a coordinator or director of evaluation and the decision to hire
one. The most common intangible uses are reported to have been changes in
understanding of problems and improved communications.
6.5 LOCAL USES OF EVALUATION RESULTS

The audiences for results at the local level are analogous to those
at the national level: administration, program staff and teachers, parents
and school boards who may be both participants and oversight agents. As
we said earlier, the audiences contain members who are indifferent to
results. Different audiences demand different information.

Site Visits

Information on the character of local level uses, generated from our
site visits, is given in the next Exhibit. The conclusions we draw are:

. there is enormous variability in interest in evaluation and,
  consequently, in the use of evidence.

. LEAs with strong research and evaluation units are more
  likely to use evaluations in any of the senses in which we've
  defined utilization.

. in LEAs without strong units, use depends heavily on the
  vigor and skill of the individual evaluator, and on local conditions.

Generally speaking, evaluations appear to attract attention if they suggest
ways to improve program operations relative to some standard. Better tests
are selected, opinions about particular features of the program elicited, side
studies on very good performers or very poor performers carried out. The work
may also be undertaken to have an independent appraisal which carries more weight
in arguing for more funds.

In very few site visits did we find any serious attempts to document
the uses to which evaluation has been put. The exceptions occur in states
with Title I regulations bearing on use. For instance, Massachusetts is one
of several states that require reports on the uses of evaluation. In
particular, the ESEA Division of the Massachusetts Department of Education
asks that in an Interim Evaluation Report, the LEA report "specific changes
(which) have been incorporated into this year's project, based on findings
and recommendations of the last year's final evaluation report." Rhode
Island makes a similar demand of its Title I evaluators and a case study on
uses is reported later in this chapter.

The following remarks rely on other studies to describe what is known 
about use of information at the local level. 

Use in Modifying Programs 

UCLA's Center for Study of Evaluation reports that nearly half of the
Directors of Evaluation Units in 350 large school districts say they spend
some time "modify(ing) programs using evaluation results." The time
investment in this activity is not high, however. Only about 15% rank it #1, 2,
or 3 with respect to time demand. Other activities, such as assessing results
or worth of a program (relative to unspecified standards), account for much more
of their time. Presumably the uses to which such assessments are put include
modifying the program.
Exhibit for Chapter 6

Sketches on the Use of Evaluation
at the Local Level from Site Visits

Site A.

An outside contractor does the evaluation of bilingual education,
examining match between goals and activity, and outcomes. The
report is circulated but no use was cited. The evaluation report
by the outside contractor for Title I is generally completed after
the new school year has begun so it is useful only to confirm
performance in the preceding year, not for changes in the current year.
There is no strong tradition of supporting evaluations or using
them. Testimonials appearing in the school newsletter on federal
or state supported programs generally get the greatest attention.

Site C.

The general emphasis of the LEA is on meeting reporting requirements.
There is no emphasis on evaluation and, indeed, the major and minor
decisions are based on outcomes of bureaucratic and political battles.
Government requirements do not produce data which is useful to them.
Evaluations initiated internally are minimal and use is made by program
staff but not by administration, oversight, or policy groups at the
local level.

Site D.

Instances of use were frequently mentioned, especially for locally
initiated evaluation activity, but there is very little documentation
available on use. Raw data required by the federal government
is not especially useful by itself. Reports are made available to
a variety of audiences routinely: faculty, staff, advisory groups,
administration. Staff members have initiated evaluations on free
time. Evaluators say that good estimates of effect of Title I
programs are impossible so they focus on implementation matters.
Outside contractor for evaluations in bilingual has been helpful in
identifying salient questions and clarifying issues. Most of the
evaluative work is done by the LEA research unit.

Site E.

A vigorous research and evaluation unit is responsible for answering
questions initiated by Title I staff, school board, administration,
and evaluation staff, as well as meeting federal requirements. The
latter are regarded as minimal but compatible with interests.
Questions addressed focus on program operations in the interest of program
modification, program planning, budget cuts, and also address
"interesting questions" when resources permit. Evaluation is seen as useful
but lack of funds is regarded as a crucial impediment. Some attempts
to estimate effects of programs on clients, cost benefit analysis of
alternatives, etc. are undertaken.
Site F.

A small evaluation unit whose coverage includes Title I, bilingual
and special education. Evaluation goes beyond federal requirements
in the first two programs. The results of monitoring are used to
modify programs and to argue for continued support for public
relations [...] claimed by the administration and [...] department
assistant are far different. The latter complained consistently that
recommendations were not taken and seemed to be ignored. The
evaluation unit reports directly to an Assistant Superintendent for
Administrative Services.



Site G.

The research unit is responsible for Title I evaluations and these
are [...], readily accessible, and appear to generate
reactions among parents, staff, and administrators. The unit used
[...] evaluations leading to program modifications
and now that the program has stabilized they spend most time on
meeting minimum requirements. Evaluation is one piece of many entering
the decision process. Evaluation in remaining programs is informal.
No documentation on use of evaluations is readily available.



Site J.

Site J has a two member research unit which merely augments evaluations
undertaken by staff. Use of evaluation by program developer with Title
IV-C funds is remarkable: field tests lead to decisions to replicate
program and tests in other sites, to dissemination of program, etc.
Evaluation in voc-ed is informal, directed toward filling local needs,
and does appear to be used. The evaluator for Title I reports regularly
to parents, program administrators, and staff but most sensitive
[...] discussed in private with individual
staffers. He does believe some of these lead to changes in program
operations. The evaluations of the Title I program six years ago
led to approval by JDRP and the program itself became the basis for an
[...] program partly on account of this evidence.



Site K.

There is an instructional research unit but program staff do their own
evaluations. Evaluations in all programs stress needs assessment.
Outcome assessments are stressed in bilingual and Title I. They go
beyond federal requirements in some programs and use information for
program modification and planning, but lack of time and resources
prevent more than small incremental troubleshooting. Reports by an
outside contracted evaluator of bilingual criticized the program and
were used to change operations and/or staff.

Site L.

Any evaluation undertaken is relatively informal. A principal may look
at test scores to determine "impact" of Title I and report this to
the school board. Head counts are a typical form of monitoring.
There is little emphasis on any formal evaluation of programs,
federal or otherwise. There are self-evaluations in which teams
are created to examine a variety of internal activities, some federal,
[...]
The major recipients of evaluation unit services are teachers and
principals. On average, 32% of unit time is spent in efforts to service
them. The next largest groups serviced are superintendents and board
members (20%) and central office staff (17%). The average amount of time
dedicated to servicing federal needs and state needs is about the same,
10% or so in each case.

According to 70% of respondents, the most consistent users of
information are program directors. A bit over half reported that principals,
superintendents, and central office staff were consistent users. Very
few (8.5%) said parents used information consistently and a bit more said
their school board used it consistently. Uses among these audiences are
said to be occasional.

Alkin, Daillak and White at UCLA reported evaluation utilization in
five case studies of the utilization of evaluations. These were evaluations
of: (1) a program under ESEA Title IV-C to improve retention of disruptive
students, (2) a teaching approach developed through an R & D center which
had been adopted as part of a compensatory education program, (3) a new
component of a Title I kindergarten program, (4) a career education program
under Title IV-C, and (5) a bilingual program under Title IV-G. Each case
study described the community and setting of the program, the program itself,
the program and evaluation administrators, the evaluation, and the types of
uses to which it was put. In all five cases, at least some use was made of
the evaluation information to modify programs or to change views about the
program. In some of the cases, these uses were major, such as the decision
by the state of California to disseminate a bilingual program to other districts.

Title I Programs and Test Scores 

A recent study conducted by SRI International investigated local uses
of Title I evaluations required from local districts receiving federal
program support. SRI based conclusions on case studies of 15 school districts.
Their general conclusion was that local districts rarely use evaluative data
generated primarily for federal reporting (David, 1978).

In particular, SRI found that test scores required under Title I "do
not serve primarily as a means of judging the program. Local skills related
tests and personal observations almost always carry more weight than test
scores, and goals other than achievement are often of more interest." This
finding is corroborated for test scores generally in an independent series of
18 case studies by Sproull and Zubrow. Test scores were one of the
least-valued sources of information available to central office school district
administrators.

But test scores did provide "gross" indicators of program effectiveness
according to interviewees in the SRI study. Again, this finding is supported
by the case studies of Sproull and Zubrow. However, the SRI case studies
indicate that changes in programs are marginal in any case, so that there
is little opportunity to use the data in making local decisions.
Test scores in the SRI study were also used to select students into
[...] in the site visits, despite the
claim that they are not pertinent for evaluating program performance.

Finally, test scores are used to confirm impressions and beliefs,
so long as results are positive, according to teachers. If results are
negative, according to the SRI study, they are taken as an annoyance which
must be explained away. The attitudes toward tests include:

"The main purpose of test scores is to support your own beliefs." (Teacher)
"I look at test scores mainly to confirm my impressions." (Teacher)

Our own site visits also suggest that test scores alone are not and
should not be a basis for major local decisions about a program. Most
[...] in any case. The information used
[...] incidental observations, and
attitude survey, teacher or parent criticism, and others. Test scores are,
however, sufficiently important to concern program managers and teachers
when the scores fail to behave as someone believes they should.

Routine Use in JDRP-NDN

The National Diffusion Network (NDN), supported since 1974 by the U.S.
[...], makes information on innovations available to LEAs and SEAs [...]
for which sufficient evidence on effectiveness is available. An innovation
is approved for distribution to other LEAs in the network only if it passes
muster on evidentiary and other grounds [...]. The panel includes members
[...].

In principle, the Joint Dissemination Review Panel is a quality control
device and routinizes the use of evaluation findings. The Panel meets
regularly to review new programs and the evidence for their effectiveness.
It has produced a manual to describe the kinds of evaluative evidence which
are most acceptable. This Ideabook is readily available.

The Panel has met more than 60 times and reviewed over
300 submissions, of which approximately 55% were approved. Projects
which are approved become eligible for diffusion under NDN and become
eligible to receive Title IV-C funds to assist the project developer in
[...]. [...] the award of a grant [...] evaluative evidence used by the
Joint Dissemination Review Panel.

Most federal agency staff interviewed by Northwestern's Project staff
agreed that the JDRP is good in principle. It routinizes use of evaluation
[...] developers about standards of evidence [...] serves as [...].
It appears to be [...].
Respondents, although supportive of the JDRP and emphatic about the
need for such a quality control mechanism, indicated several problems with
the JDRP in practice. The first of these was the role played by panel
members' judgments of "educational significance." This criterion played
an equal role, along with the evaluation evidence, in the decision to fund
and disseminate a project. "Educational significance" is a concept that is
necessarily less clear than hard data that support the effectiveness of a
program. As a result, there may be more room for the values of the panel
members and their own political needs to play a role.

Second, several informants mentioned that submitters did not uniformly
define the effective elements of their innovations, but rather evaluated
a whole "package" that might contain several elements. Any particular
element of a program might never, itself, be evaluated, and yet adopting
sites might choose this element, out of all others, to adopt. OED is
currently evaluating improved methods of ensuring that all such elements of
innovations are evaluated before dissemination.

Adopting sites are not, at present, uniformly evaluated themselves.
Such evaluation is important because the generalizability of the
innovations is not known. Moreover, implementation at a new site may not be
adequate. The American Institutes for Research evaluation of the National
Diffusion Network found that most innovations were modified at the new site.
More work appears to be needed to determine the extent to which projects
can be modified before they must be called a different project. OED is
currently evaluating alternative methods of characterizing the crucial
elements of such projects.

Considering educational innovations more generally, however, local
and state education agencies use evaluation results in the following sense.
All state facilitators in the National Diffusion Network rely on review of
evaluation evidence by the Joint Dissemination Review Panel in their
work. The reliance is automatic since any program they disseminate must
pass a review based partly on evidence before it is eligible for dissemination.
6.6 UNDERSTATEMENT AND OVERSTATEMENT OF USE

Most federal agencies and Congressional organizations acknowledge
evaluations done by others in some degree. The case studies in this
chapter, for instance, rely heavily on documented references to specific
evaluations and corroborated testimony.

But judgments about how often evaluation studies are used at the
federal, state, or local levels are doubtless less accurate than they
should be because so many uses are invisible and are not publicly
acknowledged. These uses are evident in letters and memoranda which are not
[...] public. They are implicit in public documents
which do not specify the actual study completely and in informal
communications about an education program.

At the federal level, for example, a response from the Comptroller
General to a legislator's question may refer to evaluations conducted by
agencies other than GAO. In particular, a recent letter from Comptroller
General Staats to Senators Russell Long and Robert Packwood regarding
federal regulations on day care suggested that the National Day Care Center
Study be used in judging the operation of the program. The Study results
were indeed used later, in changing regulations and in reconciling a ten-year
argument between legislators and bureaucrats. Despite the fact that such
letters are public under GAO rules, and are reproduced and disseminated,
commentators on use of evaluation are often unaware of them. Lacking
information about such memoranda means claims about the extent of use will be
understated.

Our interview with one of three Congressional Budget Office members
with responsibility for educational programs uncovered a similar phenomenon.
We were told that letters written in response to a legislator's question
do refer to evaluations where they are relevant. CBO does not reproduce
and disseminate letters of this sort as GAO does, however. A similar
obstacle to tracing use of information stems from the Congressional Research
Service practice of developing confidential memos and directed writing in
response to legislative inquiry. These are rarely, if ever, made public.
Yet some are likely to concern education programs and their evaluation.
We have not had the time in this Project to peruse nonpublic written
communications by either CBO or CRS, and have not done so. The point is that
estimates of information use are likely to be biased downward unless one
can employ nonpublic memos.

The problem of obtaining accurate estimates of the incidence of use is
more difficult with unexpected demands for information. In 1979, for
instance, a TV broadcast of 60 Minutes stressed the severe problems that a
school district in Michigan encountered in implementing Public Law 94-142
requiring access to education for the handicapped. The district had
evidently misunderstood the intent of the legislation, interpreting it to mean
that handicapped children had to be mainstreamed, and the broadcast staff
itself were no more knowledgeable. The evaluation staff of the Bureau of
Education for the Handicapped anticipated questions from Congress being
provoked by the show. They summarized pertinent results of an evaluation
study then underway and conducted a fast informal survey to establish that
indeed the misinterpretation was not common nor were some of the problems
prevalent. The speed of this reaction is not especially common among agencies.
But it is rarely documented publicly and this lack of documentation adds another
obstacle to making fair estimates of the use of information.

A second broad class of invisible uses concerns the cumulative nature
of the information one may obtain about programs. The single most
important obstacle to making such use visible is the failure in a public
forum to cite an evaluation when it did influence decisions. So, for
instance, 1978-79 evaluations of needs, resources, and the like, by the
Bureau of Education for the Handicapped do not cite evaluative studies
undertaken by the GAO on related issues in 1974 and 1976. For the public,
the Congress, and the evaluation community these may then appear to be
irrelevant when they are not. Similarly, evaluations undertaken by GAO
sometimes do not cite earlier studies supported by federal agencies or
do not cite them thoroughly enough to permit determination of which agencies
did a decent job. The main point is that the acknowledgement is not always
routine and this makes some evaluations appear to be less pertinent and less
useful than they actually are.

In reviewing reports from the Congress itself, we find some interest
in giving credit where it is due. But the lack of uniformity in citation
makes tracking utility of reports difficult. In 14 references to evaluations
in a recent Senate committee Report on appropriations, for instance, a clear
effort was made to acknowledge especially useful studies, and to provide
rationale for decisions. None cite the particular evaluation in a way which
permits one to go directly to the source or to recognize the specific source.
"A DHEW evaluation" is mentioned to describe why an appropriation is made,
for instance, with no other specification. Moreover, three evaluations which
are clearly used in other committees are not mentioned at all. This
inconsistency may be unavoidable on account of the time pressures and the priority
which other matters must take. But it does make use of evaluations difficult
to track, and it provides confusing signals to federal bureaucrats who are
interested in legislative uses of evaluation.

Informal, unwritten communications about an evaluation are not
uncommon. But they are difficult or impossible to exploit retrospectively
as indicators of the use of evaluation results. So, for instance,
interviews with two CRS staff members confirm that when they have questions
about a particular topic they simply phone staff in an executive agency,
including OED, for the answer or a lead to the answer. Two members of the
Congressional Budget Office confirmed reliance on telephone conversations
with OED staff as well and a third preferred other sources. Similar events
occur at the local level. Informal and even formal exchanges of information
between evaluator and central office staff, program director, and so on are
often not documented, counted, or otherwise recorded. The absence of this
sort of information makes utilization difficult to track and to estimate.
It does suggest that relying solely on documented uses is misleading.
A related issue is that there are uses which, according to one
Congressional staffer, "percolate up through levels of consciousness." A small
evaluation may be done by an academic researcher, be picked up by the
popular press, recognized and incorporated into a legislator's or executive's
thinking and ultimately into a decision. This sort of phenomenon may be the
most common of "uses." But it is far too subtle to examine in this report.

To illustrate how judgments about the importance of evaluations may
be misleading, consider Florio's recent survey of Congressional staff.
Asked to rank the significance of various sources of information, staffers
suggested that Educational R&D fell above media and polls, but well below
parents, local education agencies, and professional associations. One major
peculiarity here is that a good deal of R & D relies on parents and LEAs
to begin with. Similarly, though professional organizations are ranked
above R&D, those organizations must obtain their information somewhere
and R & D is a source, at least at times.

Either overstatement or understatement of the frequency with which
evaluations are used may emerge from interviews. Overstatement occurs.
At the local level, we encountered school superintendents who suggested
they personally used the information but who knew virtually nothing about
the evaluations generated by their staff. We expect that at least some
respondents felt compelled to say yes simply because they believed it was
desirable to say yes. We also encountered respondents who early in an
interview told us that evaluation was not used and who later told us it was.
The change usually came about when respondents simply talked through what
happens during an evaluation or what happens to a report, and identified
groups or individuals to whom the evaluation was pertinent.

It is not unreasonable to expect some overstatement of utility from
evaluators. Because the references to use are often not specified
completely, permitting little corroboration, overstatement is also possible.
So, for instance, in the Annual Evaluation Report for 1979 we found references to
33 evaluation reports which were said to have been used. In 17 references
to use "in proposed regulations," "in Congressional action," and the like,
the reference was not sufficiently specific to track. On the other hand,
some simple uses were left out entirely: if Congress demands a count and
it is supplied, whether or not Congress takes action is immaterial if the
information is valid. The problem of estimating incidence of use from agency
reports such as this is no less difficult than the problem encountered in
examining Congressional committee reports. The reasons for imperfect
citation are similar: little time for doing so, imperfect information on use,
and no formal system for tracking use.
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6,7 FACTORS INFLUENCING USE OF EVALUATIONS 

Five general factors appear to be important in use of information,
including evaluations: timeliness, relevance, credibility, interest, and
interpretability. The task of anticipating decisions is important but far
less clear.

In the abstract, these factors aren't especially helpful. One of our
interviewees, Clark Abt of Abt Associates, put the question of why good
evaluations are unused more succinctly. He maintains they are not used
because:

• the prospective users don't know about them,

• if the evaluation is known, the user doesn't understand
the results,

• if the results are understood, the user doesn't believe
them,

• if results are understood and believable, the prospective
user doesn't know how to use them.

The process of use is a sequential one, and a gap at any given stage
means nonuse. Abt's illustrations stem from his general experience as
President of Abt Associates and specific cases. The latter include a
recent conference on children's television run by the Educational Testing
Service and attended by representatives of the commercial broadcasting
industry. He maintains that many representatives were unaware of major
research in children's television. Some of those who are familiar distrust
it, partly because they distrust the social scientist. Irrespective of
trust, they argue that they cannot use it but present no evidence. This
is despite remarkable experiences of "Sesame Street," "Mr. Rogers'
Neighborhood," "Electric Company," and other programs which have been
evaluated well. Abt maintains that cowardice accounts for some nonuse. For
instance, the Follow Through evaluations suggest that poor program
variations should not receive support. They continue to receive support,
however.

Abt's observations are not inconsistent with our findings.

Timeliness

The complaint that information, including evaluations, is not timely
occurred frequently at the federal level, less often at state and local
levels. The charge has not been confined to USOE's Office of Evaluation and
Dissemination, of course. It has been leveled against the U.S. General
Accounting Office, judging by Frederick Mosher's recent history of the GAO
and our conversations with Congressional staffers. The charge is not
especially well documented. Individuals may say that information is not
timely but when asked for detail admit that they base the statement on one
or two major examples, rather than an average. The absence of documentation
does not obviate the point: late reports are damaging.

[...] potential causes of the delays. The [...] has been time consuming,
e.g., clearance of a recent report on [...] evaluations took nearly five
months. Delays of [...] three months [...]. [...] clearance of
questionnaires [...] the start of the study. The process can involve six
months before [...] information collection. Such delays can be devastating
to short-term studies and formidable to long-term ones. [...] rules for
both [...] processes have been created since 1978, and it is too early to
tell [...]. A third cause of delay is the [...] difficulty of carrying out
a complex study, in which all [...] according to schedule. We are aware of
[...] to identify reasonable ways to accommodate the problem. Finally,
[...] require time to digest, to [...] understood, and to identify ways in
which results can be used.

At the national level, the charge [...] has been leveled against [...]
USOE [...] and the GAO. The charge is rarely made explicit. It [...]
questions someone wanted to see answered [...]. [...] complaint in the
sense that some evaluative questions are virtually [...] the agency is
unwilling or unable to state that [...] plainly, but the project is
executed nonetheless. [...] Congressional staffers seem no [...] willing
or able to make [...].

Credibility

[...] during interviews. One aspect of the [...] the view that evaluations
conducted by staff responsible for the [...] simply not as trustworthy as
an evaluation conducted by an independent agent. A second aspect concerns
the view that criticism of an evaluation automatically implies that the
evaluation is not credible. Neither view is particularly new, of course.
The issue emerged in evaluation [...] bypass operations. And the
alternative strategies for reducing criticism are fairly well understood by
researchers if not by users of the research: independent, competent
critique, replication, and recognizing the simple fact that no single study
is ever perfect [...] of knowledge that is important. Trust in the [...]
where the audience is unable to understand the evaluation, and lacks the
resources for independent review. Neither trust nor intelligent mistrust
can be built instantly at any level of government, and this is one reason
why stability of evaluation units is critical.

Interest

Interest in an evaluation activity is a prerequisite [...] useful
information. Without that interest, it is [...] any information at all.
The incentives for providing [...] to incentives for using the
information: if the [...] can benefit, then the likelihood of the
information [...] used increases.

Knowing about Information, Understanding It and How to [...]




Once the results of an evaluation are developed, the implications of
the results must be educed. There are few formal procedures yet for doing
so.



Moreover, contractors, agency staff, and Congressional staff appear to
be mixed in their ability to do this. Indeed, some contractors have strong
reservations about making recommendations based on their work, simply
because the time available for evaluations is not sufficient to assay
options and recommendations well. Regardless of ability, recommendations
are always likely to be arguable enough to warrant several viewpoints, and
if different proposals are made, they need to be synthesized.

Printed information is often insufficient for understanding. Moreover,
there is no formal mechanism set up to provide oral reports on evaluations
to members of Congressional staff to assure understanding. Nor is there
any formal mechanism to present the results of independent appraisal of the
evaluation itself. Agency staffers are frustrated by their inability to
initiate contact with Congressional staff. Congressional staff are frustrated
by paperwork burden.

Apart from the problems engendered by the difficulty of constructing
recommendations, and difficulty of communicating them, the absence of regular
meetings invites errors in interpretation. For example, Carl Wisler of OED
notes that in 1979 a subcommittee staffer on the Senate appropriations committee
misinterpreted a finding from the Development Associates' 1978 report on
bilingual materials, and recommended cuts in funding for materials
development [...]. The misunderstanding was later clarified, and funding
was [...].

Lois-ellin Datta has noted that, even among agency staff, mis-statements
and overgeneralizations creep into executive summaries unless there is
sufficient review. For example, one state official mentioned to us that a
[...] expressed the opinion that the RAND study of federal programs [...]
educational change had shown that innovative programs do not [...] the
study had shown some of the conditions under which such programs could be
expected to work.

[...] this problem is. The matter was brought up by only three or four of
our respondents out of some 70 at the federal level. But it was brought up
by thoughtful people.

[...] no formal mechanism exists for routine follow-up to determine
whether [...]. Such information is useful for understanding how often
recommendations have been considered, accepted or [...] recommendations
are more likely to be [...] for understanding Congressional and agency
performance.



Decision Options, Relevance, and Ability to Use Information

In principle, one ought to be able to outline each possible major
outcome of an evaluation and the decisions that one could make based on
each outcome scenario. Specifying the range of possible decisions helps
[...] one could use the results of evaluation. It is a vehicle for
tracking utilization after the evaluation is complete and decisions are
made.

[...] of decisions is difficult or impossible because (a) the [...] of
evaluation are not specified in advance, or (b) the time is insufficient
to specify decision options, or (c) the [...] be specified well before the
information is collected, or (d) nobody is willing to specify options.

We have found few formal attempts to lay out decision options for
any major evaluation, by any federal or Congressional group. The exceptions
include so-called evaluability assessments, which do try to address the
question of how the information will be used once it has been obtained,
[...] that enjoy a long planning period. [...] Education Study staff took
six months to clarify Congressional interests, and there was at least some
attempt to [...] (on paper) how such information could be used. [...]
specifying decisions within an agency is feasible at least [...]
evaluability assessments which have been undertaken. Recent studies by GAO,
however, suggest that it is all but impossible to complete evaluability
assessments for broad aim programs because program [...]. [...] is that
evaluation goals cannot be clear, and neither can one specify possible
outcomes and decisions clearly.

Regardless of whether decision options can be specified, there are
often strong disagreements about what any given piece of reliable
information implies. Consider, for example, a program found to have failed
on most counts in meeting its objectives. At least one camp within the
federal executive branch takes the position that more money ought to be
put into the program then to make it succeed: the public wanted it, so it
should be made to succeed. A second camp will argue for its termination
because it failed. Still a third camp will make the decision one way if
the program is a demonstration project and the other if it was created as
a service program. The point to recognize is that either decision is
legitimate and that this does not imply the evaluation results are useless.
The evaluation informs the decision, but the decision itself must be
based on other values or theories of what one ought to do in the face of
failure.



The business of specifying decisions is complicated as well because
failure and success are typically mixed. A program may foster
reading ability, and also impede arithmetic ability. Children may learn
no more, but parents may learn a lot. And so on. This complexity should
not, we believe, impede attempts to specify outcomes or decision options.
It is a persistent difficulty, and it is doubtful that we will make much
headway by ignoring it.



Finally, innovative social programs do not succeed dramatically as
often as we would like. Anticipating small rather than large advances
is prudent, and one can do so by exploiting high quality evaluation designs.
Moreover, it is not unreasonable to prepare for failure of innovation by
specifying what else will be tried if the current effort fails. Such
contingency planning is a mundane exercise, but it is difficult to find more
than honorific attention to the matter outside planned research and
development efforts.

6.8 ILLUSTRATIVE CASE STUDIES

Simply asking an individual the question, "Are the results of evaluation
used?" is a miserably inadequate device for understanding utilization. Flaws
in recollection, the difficulty of verifying statements, fragmentary evidence,
occasional deliberate deception, and similar problems argue for a case history
approach rather than simple inquiry if the idea is to understand how
evaluations are used. The problems of understanding use from self-reports are
similar to those encountered in economic research on employment, medical
research on use of health services, and the like. Consequently, we undertook
detailed case studies of several evaluations.

We have two overlapping samples of studies: one which consists of
studies completed by the Office of Evaluation and Dissemination in 1979 and
1978, and a second sample selected purposively. In the purposive sample, we
asked federal agency staff members, Congressional staffers, and contractors
to identify remarkable evaluations which had been used over the past three
years. Then we tried to determine whether their contention about use was
supported by evidence. The same strategy was used to identify interesting
cases in the views of local and state staff. These studies then are
illustrative, intended to complement the general description given earlier. The
stress is on uses by the Congress, by management, and by oversight groups.
Special efforts were made to secure documentary evidence. Most of the work
of corroboration was done by telephone and mail. To facilitate independent
verification of the evidence, we have included references to published
documents and have acknowledged individuals who provided information.

The OED sample involved choosing those studies from the highlights of the
Annual Evaluation Reports, for FY 1979 and 1978, that appeared from their
description to involve evaluation. Studies of management per se were excluded,
as were studies of economic projections.

Sample of Studies from Annual Evaluation Reports
1978 and 1979

Purposive Sample

ASPE:   Study of Impact Aid (in house)
        Exploratory Evaluation of Follow Through (in house)
        Fund for the Improvement of Postsecondary Education (NTS)

OE:     Federal Programs Supporting Educational Change (1977) (Rand)
        Evaluation of Bilingual Education Projects under Title VII (AIR)
        Evaluation of Follow Through Planned Variations (Abt)
        Sustaining Effects Study (Systems Development Corporation)

NIE:    Compensatory Education Studies (Paul Hill et al.)
        Title I testing (SRI)

ACYF:   National Day Care Study (Abt)

Other:  Uses by Congressional Budget Office
        Uses by Parent Advisory Committees
        Uses by Providence, RI School District

OED Sample

1979

Title I services to neglected and delinquent youth (Systems Development Corp.)
Study of magnet schools (Abt Associates)
Evaluation of Project Implementation Packages (American Institutes for Research)
Study of Campus-Based Aid and Basic Grant Programs (Applied Management Sci.)
Sex equity in vocational education (American Institutes for Research)
ESAA-TV survey of viewers (Applied Management Sciences)
Sustaining effects study (Systems Development Corporation)

1978

Evaluation of bilingual education projects under Title VII (AIR)
Evaluation of Follow Through Planned Variations (Abt)
ESAA aid to non-profit organizations (Rand Corporation)
Upward Bound (Research Triangle Institute)
Indian Education Part A (Communications Technology Corporation)
Exemplary programs in career education (AIR)

Case Study on the Use of Evaluation:
The NIE Compensatory Education Study

The National Institute of Education was mandated by Congress under
Public Law 93-380, the Education Amendments of 1974, to examine "the
fundamental purposes of [compensatory education] programs and the
effectiveness of such programs in attaining such purposes." Paul Hill,
director of the staff, worked with the administrative support of NIE, which
was required to report directly to Congress rather than through DHEW.

Congress wanted information on two broad subjects: the present
operation of Title I at federal, state, and local levels, and the probable
effects of changes in Title I legislation. The study staff selected 6 major
topics, in consultation with Congressional staff, to be focused on in their
report: delivery of Title I services, state and local administration of
Title I, funds allocation methods, and effectiveness of various instructional
techniques.

Evidence of the usefulness of the study is abundant. The House Committee
on Education and Labor is explicit: "The Committee has found the quality of
the research by NIE to be excellent and has consequently relied upon these
reports in formulating amendments to Title I." (House report on HR 15, p. 5)
And from the Senate Committee on Human Resources, we have: "The committee
wishes to commend the National Institute of Education for the uniformly high
quality of its study, as well as its timeliness, as it proved invaluable to
the committee in the formulation of the Education Amendments of 1978."
(Senate report on S. 1753, p. 7) The House Committee used the NIE report
to organize the subject matter of the hearings on Title I.

The Senate and House reports cite findings as justification for the
form and content of several Title I amendments. On occasion, the findings
are so distinctive that their contribution to amendments admits of little
doubt. The actions, statements, and proposed amendments that accompany
citations of NIE findings are summarized in the following:

Effects of Title I on Recipients 

Both the House and Senate reports explicitly acknowledge the study's
positive findings: that services are delivered to appropriate children, that
the program contributes to educational experiences of these children, and
that Title I does enhance student achievement in districts in which the
program is stable and well implemented. The House report asserts that:
"All these findings can be contrasted with earlier studies which showed that
disadvantaged students fall more and more behind in their achievement levels
and become increasingly pessimistic about their ability to improve through
education." (p. 7)

The Study then changed or reinforced some attitudes toward the
effectiveness of compensatory education, a major impact of the information in
and of itself. The House report used this information in its argument that
Title I be reauthorized for another five years and that increased funding is
warranted.

Funds Allocation

Congress had altered the formula for allocation in 1974, such that
only two thirds of AFDC children could be counted in a district's
application for funds. Both the House and Senate reports cited the NIE finding
that only 6.9% of participants qualified for Title I under the AFDC measure
[...] Congress had been given in 1974. Moreover, [...] unevenly
distributed, with urban areas relying heavily on AFDC eligibles for Title I
funding. These considerations led the committee to restore a full count of
AFDC children to the formula for funds allocation. The proposed amendment
became part of PL 95-561 (Title I, Part A, Section 112).

The House report noted the Study's examination of alternative poverty
formulas that would recognize needy individuals not currently eligible for
[...]. The alternatives would increase funding for the Northeastern
and North Central states, while the share of the South would decrease
slightly and the West would be unaffected. It was clear that the Committee had
seriously considered the information presented, but they noted that no one
index of poverty would perfectly capture the distribution of needy individuals.
The Committee therefore proposed, among other amendments, that all funds up
to the 1979 level of appropriations would be distributed according to the
present index. These proposals subsequently became law (Title I, Part A,
Section 111, paragraph 2D).

The House report cited the Study's evidence that states differ in their
criteria for allocating funds within counties to particular school districts.
Because school districts are not coterminous with counties, the process of
subcounty allocations can become very complex. The Study also found that
several states pool grants for all counties and give funds to districts
based on the states' own eligibility formulas. NIE concluded, "this practice
violates the basic Title I statute and the regulations, but it could produce
results wholly consistent with the intent of both." "Consequently" (p. 12)
the House Committee proposed an amendment permitting States to allocate money
directly to districts, ignoring county allocations. The Senate report
proposed a similar amendment, and it became law (Title I, Part A, Section 111
(a) 2B).

The Study staff discovered that Title I funds per participating child
were lowest in poor rural districts, because these districts have high
concentrations of [...] children, and because they are located primarily in
states with the lowest education expenditures. Low levels of funding for
general [...] efforts. This information, in conjunction [...] testimony,
was presented as the rationale for an amendment authorizing $400 million
for supplemental grants to districts in counties with high concentrations
of [...] children. The Senate bill also contains this provision, [...]
proposed it. It subsequently became law (Title I, Part A, Section 117).

The Study staff discovered that strong pressures exist within LEAs
to increase the number of schools receiving Title I. The goal of
concentrating aid on schools with the lowest income was not being met. NIE's
finding that most demonstration districts chose to allocate funds to schools
on the basis of student achievement criteria was noted. In expressing
concern that funds might be diluted, the House report noted NIE's conclusion
that demonstration districts had extended services to more students without
reducing the intensity and quality of services, but they could not do so for
long without increased funding.

Citing this and other evidence, the House and Senate bills proposed
several actions. First, LEAs would be authorized to rank schools both
according to poverty and educational deprivation. Poverty schools would be
served in the order of their rank unless a school ranked lower in terms of
the number of educationally deprived children. Second, regulations would
be created to allow LEAs to serve schools ranked lower in terms of
educational deprivation than schools ranked higher in terms of poverty. Third,
LEAs would be permitted to skip schools receiving special state or local
services similar to Title I and like amounts. Finally, the bill required
that once a measure of ranking is chosen by a district, it must be uniformly
applied. These proposals became law (Title I, Part A, Section 122 and Section
123(d)).



Services Delivered to Students 

The Study found that less than 1% of secondary school students receive
Title I services compared to 20% in elementary school. The House and Senate
reports cited this finding and noted that many districts hesitate to adopt
high school programs because they do not know the program types that would
be legal. Both Senate and House Committees noted that the Commissioner
should include in regulations legal models for issues that might arise due
to use of Title I funds in elementary and secondary schools.

Both the Senate and House reports cite the Study's conclusions that,
while Title I does not require or encourage particular instructional strategies,
some state and local officials believe that HEW auditors prefer the "pull out"
design. Both Senate and House Committees direct OE to develop regulations
describing both "in class" and "pull out" designs for Title I administrators.

Both House and Senate reports cited the Study finding that a quarter of
Title I students are assigned to homerooms exclusively for Title I students,
and may be separated from higher achieving students for the entire school day.
Both reports emphasize that Title I does not intend such segregation, which
should be avoided. If it does occur, programs must show that Title I children
are receiving their fair share of state and locally funded services.

The Study data revealed that parental involvement activities by 1976
had accounted for the largest expenditures for Title I auxiliary services,
that a third of all districts surveyed had no functioning chairperson for a
school advisory council, and that one fourth of districts had no council at
all. Citing this and other information, the House bill revised the
requirements [...] councils. Two of these provisions became law. A [...]
legislation (Title I, Part A, Section 125). NIE was also directed to
conduct a study of the effects of parental involvement in Title I programs
(Section 125).

[...] the "supplement, not supplant" provision for [...] provided
sufficient flexibility for districts [...] concluded that regulations
promulgated by OE were not sufficiently comprehensive, because some of the
tests for compliance [...] regulations and the regulations did not take the
needs of [...] administrators into account. Both the House and Senate
reports [...] and required that tests of compliance on the supplanting
issue be published in the Federal Register. Moreover, OE regulations were
[...] legal models for non-supplanting of funds and to explain how these
principles apply in day to day situations.

[...] Study concluded that an amendment was necessary to encourage states
and school districts to give compensatory [...] of services that are
locally funded [...] establishing compensatory education services similar
to those of Title I. An amendment for such comparability was introduced
and later became law (Title I, Part A, Sections 126(c) and 131).

[...] reports cite evidence from the Study that [...] of private school
students in Title I districts are served by [...] services. The Study
suggested that private schools may not be made aware of availability of
Title I services or that public school officials may not design the
services to take into account the needs of private school children. Both
House and Senate Committees introduced two amendments to ameliorate this
problem. First, equal expenditures for [...] be required, although the
number of [...] to be considered. Second, the [...] Commissioner be
required to exercise his bypass authority [...] prompt resolution of
complaints of private schools dealing with Title I. In addition, the
Committees urged strengthening of OE regulations to ensure that private
schools become aware of the availability of services to their Title I
eligible students. These proposals subsequently became law (Title I,
Part A, Section 130).

According to state administrators interviewed in the Study, most problems
in coordinating Title I and state-funded compensatory education were due to
a lack of clarity in interpretations and guidelines issued by Federal Title I
[...] inconsistencies in Federal monitoring and enforcement. These
findings, along with other testimony, led the House and Senate Committees
[...] requirements for obtaining an exemption from Title [...]
comparability provisions. A new exemption would apply to state programs
that are being phased in. Another proposal required that OE or SEAs
determine in advance whether special state or local programs satisfy the
requirements for exemption. These amendments subsequently became law
(Title I, Part A, Section 131).






Both the Senate and House Committees concluded that when schools have
a very high percentage of low income children, school-wide projects to
serve them were in order. This conclusion was based in part on evidence
from the Legal Standards Project, supported by the Study, that in such
schools it is difficult to design "special" programs for this large majority
which do not also serve the school as a whole. Both the House and Senate
Committees therefore proposed an amendment permitting schools with high
concentrations of children from low income families to use state, local and
federal funds to design a single compensatory education program for all
children. A revision of the proposal became law (Title I, Part A, section
[...]).



State Administration 

The Study concluded that many states are not clear as to their authority
in rulemaking, disseminating information, providing technical assistance
and monitoring compliance. This confusion was attributed to ambiguity and
difficulty of legislation and inconsistency of federal monitoring. Moreover,
the legal framework for Title I has led some states to practices that are not
in compliance. The House report clarified rulemaking language in proposals
which subsequently became law (Title I, Part [...], section 165). The Senate
report discussed these findings under their proposed Title V, and the
rulemaking amendment became law (Title V, Part A, Section 504). The House
Committee also directed OE to clarify regulations and insure that states
do not misinterpret their authority.

NIE found that states had implied authority to withhold funds, but
that the manner in which they did so was "quite inconsistent." Citing this
finding, the House report clarified language dealing with the manner in
which states are to withhold funds. This proposal subsequently became law
(Title I, Part C, section 186). The Senate report cited these findings
under their proposed Title V, and the withholding amendment here became
law also (Title V, Part A, Section 508).

The House report noted the NIE finding that Title I regulations dealing
with state enforcement are scattered throughout the legal framework for
Title I, thus impeding Congressional intent. The Committee urged the
Commissioner to revise regulations for state enforcement such that they
facilitated compliance and described options and the legal basis for sanctions.
In addition, both the Senate and House bills included an amendment requiring
states to submit monitoring and enforcement plans. This proposal subsequently
became law (Title I, Part C, Section 171).



NIE concluded that most states require fiscal audits, as stated in
their policies, but omit compliance audits. Moreover, audit resolution
varies widely among states. Citing these findings, the House report
introduced an amendment clarifying auditing and audit resolution
responsibilities of state education agencies. This amendment subsequently
became law (Title I, Part C, section 170). The Senate report discussed these
findings under their proposed Title V, and the auditing amendment became law
(Title V, Part A, Section 509).

[...] effective [...] their responsibilities [...] the administrative
set-aside portion of state funds were increased. [...] testimony as
justification [...] administration. The Senate report [...]. [...] Senate
bill consolidated state administration [...] Title I and Title IV under a
new Title V of ESEA. The [...] for consolidation was that paperwork could
be diminished and [...] efficiently. These proposals subsequently became
[...].



Administration



[...] conclusion that Title I was generally well-administered at the
federal level [...] program. According to the [...] report, OE did not
implement requirements rigorously and [...] apply consistent standards in
identifying violations of [...] "supplement, not supplant" requirements.
The Study also found that [...] pertaining to Title I are not written
clearly. [...] audits and to recover money where necessary [...] directed
to describe each step of [...] and resolution [...] regulations and to
provide an annual report to Congress on audits. This [...] subsequently
became law (Title I, Part D, Section 185).

[...] although OE had developed a body of [...] guidelines and
applications of regulations, this body of experience has not been [...]
together nor disseminated to state and local agencies. The House and
Senate bills included an amendment directing OE to [...] local agencies.
In [...] the NIE report on administration includes recommendations for
improving [...]. Some of these recommendations were included in the [...]
report and the Committees expressed support for them. [...] amendment
subsequently became law (Title I, Part D, Section [...]).

General Provisions

The House report noted that problems existed with current methods
of demonstrating comparability between Title I services and those provided
by state and local funds. It cited NIE's finding that the current method,
using the non-Title I average, could lead to inequalities for Title I schools.
The House therefore introduced an amendment allowing OE to waive this
requirement in order to allow selected school districts to try alternative
methods of demonstrating comparability. This became law (Title I,
Part [...], Section 102). Testimony other than the NIE findings also was
cited as justification in the House report.

The House report cited the finding that although procedures to set
indirect cost rates were generally clear, there were two exceptional cases
of a lack of clarity in regulations. The committee directed OE to clarify
these regulations, which were leading to variations among states in setting
indirect cost rates for state administrative set-asides and for Title I
programs for handicapped and neglected and delinquent children.

This listing is merely a content analysis of statements in the record
of the two Committee reports. While it is not possible to establish a
direct link between Committee action and the NIE Study findings in all cases,
all the evidence points to the conclusion that the findings contributed greatly
to the form of the bills. The NIE Study was highly useful in the sense that
it contributed to at least 21 separate sections of the 1978 Amendments relating
to Title I and Title V. Furthermore, the House committee directed the
revision or addition of regulations to clarify procedures and policy in six
other areas of Title I administration and funding.

In developing this case study, Paul Hill, former director of the
Study, and Iris Rotberg, a co-principal investigator, were interviewed.
Pertinent information was provided by Jack Jennings of the Congressional
staff and Chris Cross, a former staff member.

In developing this case study, Iris Rotberg, deputy director of the
study, was interviewed. Pertinent information was provided by Jack Jennings
of the Congressional staff and Chris Cross, a former staff member.

Case Study: Evaluation
of Title VII Bilingual Programs

American Institutes for Research conducted an evaluation of the
Title VII Bilingual Programs from 1975 to 1977. From 38 sites of
bilingual programs in their fourth or fifth year of operation, classrooms
were randomly selected from grades 2 through 6. Non-bilingual classrooms with
comparable students were selected in or near the districts incorporating
the bilingual classrooms. Children were tested for English comprehension
and reading, mathematics, Spanish oral comprehension and reading, and
attitudes. The evaluation was supported by OE's Office of Evaluation and
Dissemination.

An interim report was submitted to the Office of Education for the
reauthorization hearings of ESEA in 1978. The findings included:



1) Hispanic students not in Title VII classes outperformed
Hispanic students in such classes in English proficiency
(with some variation across grade levels).

2) Hispanic students in Title VII outperformed non-Title VII
students in mathematics. (In the final report, however,
additional data and a new analysis showed that Title VII
and non-Title VII students appear to do equally well in
math.)

3) Less than 1/3 of students in Title VII programs were of
limited English speaking ability and generally did not
exit the program when they became proficient.

4) Title VII Hispanic students had higher Spanish proficiency
than did the non-Title VII comparison group, though the
Title VII program may not have been the only contributing
factor.

These findings appear to have been used by Congress to alter the
Title VII program, judging from reports of the House Committee on
Education and Labor, the Senate Committee on Labor and Human Resources, and
other groups influencing the 1978 Education Amendments.

Congressional Use

The House report first cited the AIR findings, the controversy they
produced, and the critique by NIE of the adequacy of the information.
In spite of uncertainty over the findings, enough information existed
from other sources to justify certain amendments.

First, the AIR study showed the inadequacy of the current definition
of the target population. The definition, as of 1978, equated speaking
ability with competency in English, omitting understanding, reading and
writing. The AIR measures were English reading and understanding. The
new House bill broadened this definition to include these concepts. The
Senate report [...] broadened definition, which became law (Title VII of
PL 95-561, Section 703).



[...] further specified the students to be served by stipulating [...]
that many bilingual classrooms contained a majority of English[...] the
consideration that English and non-English [...] interaction in the
classroom. The House therefore [...] proposed that for "pull-out"
bilingual programs all [...] regular classrooms, that the number of [...]
be proportional to their numbers in school [...] classroom be
English-speaking. Although the [...] rationale, the AIR report was not
cited. The Senate provision became law (PL 95-561, Title VII, Section 703).

[...] come about because of the AIR study, according to [...] of our
interviewees, namely, the requirements that OE [...] identifying children
of limited English ability, and that [...] bilingual education and models
for the evaluation of these programs.

[...] 85% of project directors said that students did not [...] proficient
in English. Citing this [...] for determining when those children [...]
assistance. Thus language was strengthened to indicate [...] proficiency
was the goal of the program. This became law [...].

The House bill contained a general [...] federal assistance for
bilingual education be limited to 5 years [...] project, with
waivers for special circumstances. [...] this provision was
that bilingual programs are expensive to [...] not be so [...]
they are established. The AIR report was cited, [...] higher costs
for Title VII bilingual programs, compared to non-bilingual programs
serving Hispanic children. The five-year rule became law (Section [...]).

[...] House report cited AIR evidence [...] that, although 80 percent of
teachers in their study had a regular or bilingual teaching credential,
there was great variation in standards for being certified a bilingual
instructor. The House required OE to conduct a study of the impact of
[...] teaching fellowships in [...]. This amendment became law
(Section 723).




The evaluation has also been used by Congressional support staff.
See the case study on the Congressional Budget Office.

Management and Executive Uses

After the Amendments of 1978 passed, OE contracted with the
Southwest Regional Laboratory (SWRL) to identify entry and exit criteria for
bilingual students. According to several of our interviewees, this
contract was a direct result of the AIR findings on the number of
English-proficient students retained in bilingual classes.

Secretary Califano was cited in Education Daily for January 23, 1979,
as saying that "the administration will 'make it clear' to schools
receiving bilingual money that the law mandates that children must 'learn
English as rapidly as possible' and be taught other subjects in their
native language only until they are proficient in English." Education
Daily also quoted Ernest Boyer, OE Commissioner, as saying "the Education
Amendments of 1978, PL 95-561, 'makes it explicit' that bilingual
education is to be an English language-development program." Moreover, the
Secretary directed OE to require that bilingual classes consist of 15%
or more students of limited English ability.

State Use 

The California State legislature used the AIR study. Assemblyman
Dennis Mangers introduced a bill (AB 2400) that would have required that
districts use multiple criteria to assess whether a student was ready to
return to English classrooms. According to his legislative aide, the
AIR study reinforced his feeling and other information that many pupils
are not so limited in English that they could not benefit from English
classes.

Judicial Use 

The AIR study was also used in Cintron vs. the Brentwood School
District in the State of New York. In this case, the Puerto Rican Legal
Defense Fund sued the school district of Brentwood to gain more funds
for bilingual education. The principal investigator of the AIR study
was subpoenaed to describe the AIR findings. The court determined that
the Brentwood district was operating a Spanish language maintenance
program with no exit criteria for English proficient students. The
court ruled that the school must have criteria whereby students judged
proficient in English would leave the bilingual classroom. This case
preceded the Education Amendments of 1978. Although the Brentwood
district was not in the AIR sample, information from the AIR study
allowed the court to ascertain ways in which the bilingual program
could be improved.

Publicity and Controversy 

The AIR study generated a great deal of controversy and publicity.
Even before the hearings on Title VII, several articles on the study had
appeared in the press. This trend continues, with the study being cited
on Prime Time Saturday, a CBS TV national broadcast, on May 12, 1980.

The study had many difficulties along the way. OE was taken to
court after the contract was let, which used valuable time. The second
[...] of budget restraints. NIE [...] conscientious and, at least [...].
The AIR principal investigator cited one example in which a [...].

[...] in part because of the evidence that [...] language maintenance
programs. Although local districts are [...] maintenance programs, the
federal government will only fund programs [...] the acquisition of
English and eventual [...] to English language classrooms. [...] the
native language as being relegated to a [...] transitional programs would
[...] yield fewer bilingual classes in some cases. For these reasons the
bilingual community attacked the findings vigorously.

[...] interviewed Malcolm Danoff, principal investigator for the AIR
[...] bilingual education programs, [...] of the California State
Assembly, and [...] of OED. John Ellis, former Executive Deputy
Commissioner of Education, was also interviewed.

[...] educational program evaluation for federal policy [...].

[...] Research, March 1978 (ERIC [...]).

[...] Congress, Second Session. [...] U.S. Government Printing Office,
May 11, 1978.

Senate, Committee on Human Resources. Report: The Education Amendments
of 1978. [...]

Case Study of the Use of Evaluation
The Fund for the Improvement of Postsecondary Education

In 1978-79, the NTS Research Corporation conducted an evaluation of
the Fund for the Improvement of Postsecondary Education, a program then
authorized under Section 404 of the General Education Provisions Act.
The evaluation was supported through a contract between the Office of the
Assistant Secretary for Planning and Evaluation (ASPE), DHEW, and the
Corporation.

In testimony before the Subcommittee on Postsecondary Education, the
director of the Study, Sol Pelavin, concluded that the Fund was "extremely
successful ... when judged by any number of criteria." He stressed that
such positive and relatively unqualified judgments about the worth of a
program were relatively rare, making the Fund a remarkable exception.

The evidence that this evaluation was used stems from recognition of the
evaluation in the Report of the House Committee on Education and Labor and
from corroborative testimony of the contractor, the federal agency project
monitor, and a Congressional staff member with responsibility for obtaining
information about the Fund. The Committee Report, for instance, quotes
Pelavin's testimony in justifying the transfer of the Fund from Section 404
of the General Education Provisions Act to Title X of the Higher Education
Act to give the legislative visibility deemed appropriate by the Committee
(p. 56). The evaluation finding that the Fund was successful in meeting its
objectives and Congressional intent along a number of dimensions also
appears to have been used to justify authorizing an increased level of
funding.

Information about the use of the evaluation was obtained from Keith Baker,
a former staff member of ASPE in the Division of Education, Sol Pelavin of NTS
Corporation, and Thomas Wolanin, Staff Director of the Subcommittee on
Postsecondary Education, Committee on Education and Labor.

Case Study on the Use of Evaluations
The Congressional Budget Office



[...] Congressional Budget Office whose primary responsibilities [...]
educational budget policy. They each [...] quality of the evaluations
preclude CBO's using results of [...] instance involved development of
[...] understanding of the degree to which program services are provided,
whether the program has detectable effects, and costs. [...] of
evaluations in CBO reports to the [...] the use [...].

[...] Office of Evaluation and Dissemination's evaluation of the [...]
program was used in CBO's report to the [...]. The specific use is in
understanding [...], notably on influencing the likelihood that [...]
post-secondary education and no detectable effect [...]. These OED
evaluations are not the [...] generated by the National Center for
Education Statistics [...] other private and public agencies were also
used. The report also cites individual university research efforts [...]
source of support. At least one of these [...] longitudinal studies of
college [...].

The Congressional Budget Office's draft analysis of federal efforts
[...] compensatory education beyond that provided through Title I and
[...] support of high school students. No reference to any [...]
evaluation of vocational education in high school is made. We believe that
scarcity of reasonable quality studies accounts for this.

OED supported studies of bilingual education, especially an evaluation
by American Institutes for Research, are cited with explicit reservations.
Descriptive data on attitudes about federally supported bilingual programs
are cited to assist understanding the conflict between federal and local
views. The evidence on effects of the program is judged inconclusive by
CBO staff. NIE supported efforts to estimate the effects of career
education programs are mentioned but not cited explicitly in CBO's description
of whether the program is effective. Similarly, OED supported evaluations of
Upward Bound are cited in describing who is served and the effect of services
but no explicit citation is provided.

CBO's report on day care services and the role of federal support relies
heavily on seven major day-care studies. About 40% of the citations (pages
5-42) refer to a recent Abt Associates evaluation supported by the
Administration for Children, Youth, and Families. The remainder are most
frequently syntheses supported by the Assistant Secretary for Planning and
Evaluation. No OED or NIE studies are cited explicitly.

Apart from citation of this sort, CBO staff informed us that evaluations
are cited in memoranda written in response to Congressional inquiry. We did
not have the time available to examine the character and frequency of such use.

Case Study on Use of Evaluation:
The National Day Care Study



The National Day Care Study was executed during 1974-79 with the
support of the Administration for Children, Youth, and Families at a cost
of approximately $8 million. Conducted by Abt Associates, it focused on
[...] of the services provided to children under five [...], on [...]
staff/child ratios in centers, on [...] quality of care, and on whether
reducing ratios would decrease costs of federally supported day care. The
[...] standards and federal interagency [...]. The National Day Care Study
was one of [...] concurrent studies [...] general topic and has been
supplemented by [...] agency staff. The uses to which results were put
[...] respects. Some documentation on use is available and we outline the
evidence below.

Federal Regulations 

[...] NDCS had a direct bearing on revision of the Federal Interagency
Day Care [...] funded under Title [...] of the Social Security Act.

In particular, proposals for new regulations were issued in the Federal
[...] staff/child ratio in the classroom can be altered to reduce cost
[...] of quality in care, [...] constraints on group size, the smaller
being better generally, [...] of staff/ratio requirements, and three
policy options on [...] to specify group and staff requirements, and
employ one of [...] (pp. 17877-[...]) specifying group size and
staff/child ratios.

The Study "found a critical relationship between specialized caregiver
training and the quality of day care children receive" [...]. Proposed
regulations incorporate the finding by requiring that all caregivers
participate in "specialized training [...] within six months of
employment" (p. 34717). [...] regulations which require that "[...]
nationally recognized child development credential regularly participate
in [...]" (p. 17872). Further, the Study found no clear [...] measure of
welfare or development of children and [...] related experience, and
observed that [...] to incorporate either into future federal Purchasing
regulations. [...] regulation echoes this observation by not specifying
entry level requirements for center caregivers.

The Study recommended that the federal requirements bearing on size
of children's groups and staffing "be determined on the basis of scheduled
enrollment of children rather than their actual attendance" (p. 34762).
The recommendation was offered partly in recognition of confusion caused
by children's absenteeism encountered at the centers by Study staff. The
final regulations permit designation of group size on the basis of either
enrollment or attendance, the deviation from original NDCS recommendations
being influenced at least partly by independent commentary on the proposed
regulations arising from variations in state practices.

Oversight and the Congress

There was some use of the study by the Congress and the U.S. General
Accounting Office in the following sense. Senators Robert Packwood and
Russell Long asked the U.S. General Accounting Office for advice about day
care regulations. In a written response of September 25, 1979, Comptroller
General Staats recommended that the new regulations be based on the National
Day Care Study and on the GAO's reanalysis of the work. Staats' letter
reiterated NDCS study findings that early regulations on staff/child ratios
and training were too stringent and that good quality care can be obtained by
relaxing these requirements within limits. Moreover, GAO's independent
study of day care was used as a partial basis for verifying conclusions.

One rather important but difficult to verify use of the NDCS concerns
resolving a ten-year debate between Senators Long and Packwood and
DHEW on day care regulations. The discussion concerned the
stringency of regulations, the problem that some states could not meet
requirements, and the Congress's continuing resolution permitting waiver
of regulations for such states. The NDCS study provided a basis for resolving
differences between the Congress and the agency, and for eliminating the need
for the waiver resolution.

There is some indication of use at the general policy level in the
following sense. The Council on Wage and Price Stability reviewed the
study and complimented DHEW on its conduct in a Council document dated
September 24, Docket No. 79-184-32. In its issues and options paper on
federally supported child care, the Congressional Budget Office relied on
seven major studies of day-care services including the NDCS. Their citation
of NDCS accounts for over a third of the citations in their discussion of
who is served, whether services are effective, and the need for services.

Verification of factual information on use of the National Day Care
Study was obtained in telephone interviews with Richard Ruopp of Bank Street
College, Jeffrey Travers of the National Academy of Sciences, and Herb
Millstein of the U.S. General Accounting Office.

Case Study on the Use of Evaluation:
The Joint Dissemination Review Panel



History

The present Joint Dissemination Review Panel (JDRP) was begun in
1972 by Dr. John Evans and Commissioner Sidney Marland as the Office of
Education Dissemination Review Panel. The JDRP includes representatives
of both the Office of Education (USOE) and the National Institute of
Education (NIE). Its purpose is to provide an internal quality control
mechanism over the use of federal funds for the dissemination of educational
products and practices for which claims of effectiveness are made. This
purpose is accomplished by requiring approval by the Panel that there is
sufficient evaluative evidence to substantiate the claims made about the
innovations. The programs and practices must also pass a pre-review
screening for social fairness and absence of potentially harmful side
effects which is done by the relevant Education Division program office.
Most of the approved programs have been actively disseminated by USOE's
Division of Educational Replication through the National Diffusion Network.
Of 421 submissions to JDRP, 245 (57.7%) have been approved. Of those
approved, 60% (149) were developed using Title I, Title III, or Title IV
money. All approved programs are described in Educational Programs That
Work.

Structure and Operation

The JDRP consists of 22 members, 11 each from USOE and NIE, who serve
two-year terms. Members are nominated by the Commissioner of Education
and Director of NIE on the basis of their education and experience in
evaluation and their practical knowledge of education. Panel members are
selected on the basis of their interest and willingness to participate.
There is no financial or "in kind" reward for serving. The Panel itself
has no official status or budget within the Education Division although
the Executive Secretary, Mr. Seymour Rubak, is a staff member of the
USOE's Division of Educational Replication. Meetings of the Panel are
scheduled by the Executive Secretary whenever 2 or 3 submittals have been
received. Seven to nine Panel members are scheduled to attend each meeting.
The meetings are very informal and open to the public. The vote on each
submission is taken immediately after it is discussed and is also public.

Pre-review Screening

Individual program offices within the Education Division are
responsible for identifying and screening potential candidates. In addition
to making preliminary judgments on the accuracy and adequacy of the
evidence of effectiveness, the screening process is intended to insure
that innovations are socially fair, free of race and sex bias for example,
and present no potentially harmful side effects. The individual responsible
for the screening signs a transmittal memorandum assuring the Panel that
the screening was done and the innovation found acceptable. A USOE
Assistant Commissioner or NIE Associate Director also reviews each
submittal and approves it before it is reviewed by the JDRP.

Based on conversations with agency staff, this screening process
[...] although two exceptions were noted. The first relates to the [...]
of evidence and its presentation in the actual submittal. Program offices
vary in terms of their experience with [...] the persuasiveness of the
actual submittal [...] the quality of the evidence. Federal agency [...]
operations expressed the feeling that the quality of both [...] evidence
and its presentation have increased over time. The [...] products may have
been approved which [...] not [...] of the criteria. For example, we were
told of an instance in which [...] complained that a JDRP approved program
was not free [...] and the difficulty of applying these criteria [...]
justified criticism of individual programs or products.

JDRP Review

[...] on the evaluation evidence collected to substantiate the claims of
effectiveness made for the innovation. The evidence is presented to the
Panel in the form of a ten-page submission [...] which [...] information
about the innovation and its development [...] innovation and for what
population [...] of adoption would be. However, the major emphasis is
[...] effectiveness and [...] affecting its credibility. The criteria used
by the Panel are [...] numerous examples of both persuasive and [...]
persuasive evidence. Briefly, the evidence must be reliable [...] and
valid. The effects must be both statistically and educationally [...].
There must be credible evidence that the observed outcomes were caused by
the program and would not have occurred [...]. [...] innovation should be
generalizable to other sites at a reasonable cost.

[...] conversations with JDRP members and minutes of the JDRP [...]
causal link between the program [...] outcomes is one of the primary
requirements for approval. The [...] provide Panel members with an
opportunity to clarify any [...] and gives the individuals making the
submission a chance to convince the Panel that their program is
exemplary. The votes are almost always accompanied by comments
which provide reasons for the Panel's decision.

JDRP Performance 

[...] no [...] evaluation of the JDRP has been done, it is the
feeling of the people to whom we talked that the Panel is serving an
important function and serving it well. The standards it uses for
reviewing evidence of effectiveness conform reasonably well to other
related [...] by the Evaluation [...] Committee on Standards for
Educational Evaluation. [...] in published form (Tallmadge, 1977b;
Crandall, 1975) suggests reasonable adherence to the criteria. [...]
(Emrick, [...]) [...] of National Diffusion Network [...] adoptions of
JDRP approved programs by
local school districts. However, not much evidence of continued
effectiveness is yet available. The cost of adoption has been about
$4000 on average, while the average cost of program development was about
$300,000. Major criticisms of the Panel involve three issues: the rigor
with which the criteria are applied (cf. Tallmadge, 1977a), the Panel's
decision not to review proprietary products, and the issue of equity of
opportunity in providing convincing evidence. Each of these issues is
discussed by Datta (1977).

The information contained in this case study was obtained from the
references cited below, minutes of JDRP meetings, and interviews with the
following individuals:

Dr. John Evans, Assistant Commissioner, Office of Evaluation and
Dissemination, USOE
Maryann Millsap, Senior Associate, Teaching, Assessment and Evaluation, NIE
Ann Bezdek, Office of Evaluation and Dissemination, USOE
Jeff Schiller, Assistant Director, Teaching, Assessment, and Evaluation, NIE
Seymour Rubak, Executive Secretary, JDRP, USOE

Case Study on the Use of Evaluations
The Impact Aid Program



An evaluation of the program for School Assistance for Federally
Affected Areas (Impact Aid) was conducted by Lawrence L. Brown III, Alan
[...], and [...] Jacobs, of the Office of the Assistant Secretary for
Planning and Evaluation, DHEW. The study was initiated to assist the
development of the Administration's proposal on Impact Aid for the
Education [...] of the House Appropriations Committee and that of
Education and Labor had requested such a study in [...]. The study
assessed revisions [...] Educational Amendments of 1974, using data
from FY 1976, the first year of implementation.

[...] concluded first, that the government could justify [...] children
associated with Federal activities [...] deprive districts of taxes. This
would eliminate Impact Aid to districts with parents who worked in federal
jobs out-of-[...] and to districts with locally-owned public housing.
However, [...] the ASPE group concluded that the [...] Impact Aid would
only be equitable if Title I funds to such districts were increased.

A second conclusion was that the current formula for compensating
[...] rates of compensation, while a "floor" of half the national average
expenditure per pupil did not reflect disparities of expenditures across
states. Although [...] of funds were allocated to relatively wealthy
districts that were only lightly impacted [...] activities, heavily
impacted districts showed a real burden [...] of revenues, depended
heavily on Impact Aid for [...] in comparison to the average for their
respective states.

The ASPE researchers also found that Impact Aid might actually impede
states from providing their own equalization money, because Impact Aid as
currently formulated offered them no incentive to do so.

The ASPE report presented three comprehensive reform packages, which
varied according to their departure from current practice and degree of
cost savings. The Administration chose the second of these packages, which
departed moderately from current practice. This set of recommendations
included the following provisions:

1) That payments for children whose parents work outside the county be
   eliminated.

2) That payments for children residing in low rent housing be
   gradually withdrawn.

3) That the method of computing compensation by local rates be
   eliminated for all but the most heavily impacted districts,
   and that the "floor" rate of 1/2 the national average be
   eliminated.

4) That a 3 percent "absorption" provision eliminate payments for
   federal children equal to 3 percent of the district's non-federal
   enrollment (eliminating lightly impacted districts).

5) That the "tier" system of funds allocation be eliminated.
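
The absorption rule in provision 4 amounts to a simple threshold. A minimal
sketch in Python, assuming the provision works as the list describes (a
district absorbs federal children up to 3 percent of its non-federal
enrollment, and only children beyond that number count toward payment). The
function name and the example enrollments are hypothetical and are not taken
from the ASPE report.

```python
def payable_federal_children(federal_children, nonfederal_enrollment,
                             absorption_rate=0.03):
    """Return the number of federal children still eligible for payment.

    Assumption: the district "absorbs" federal children equal to
    absorption_rate times its non-federal enrollment; payments are
    eliminated for those children and kept for any beyond that number.
    """
    absorbed = absorption_rate * nonfederal_enrollment
    return max(0.0, federal_children - absorbed)


# Hypothetical lightly impacted district: 20 federal children among
# 1,000 non-federal pupils. All 20 fall under the 30-child threshold.
lightly_impacted = payable_federal_children(20, 1000)

# Hypothetical heavily impacted district: 400 federal children among
# 1,000 non-federal pupils. Only the first 30 are absorbed.
heavily_impacted = payable_federal_children(400, 1000)
```

Under these assumptions the lightly impacted district drops out of the
program entirely, which is the effect the provision's parenthetical note
describes.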

Each of these provisions was offered by the ASPE report. Moreover,
the Administration offered findings from the report as justification
for the suggested provisions.

However, the Administration's proposal on Impact Aid was not adopted
by Congress. The Senate Appropriations Committee did make some use of
the information and expertise that the ASPE group had gained. According
to Congressional staff member Sam Hunt, this information was useful in
discussions about ways to cut the program. Although appropriations were
finally cut, it is not possible to say that the ASPE evaluation had a
great deal to do with it.

In creating this case study, we interviewed Lawrence Brown and Marti
Jacobs, formerly of ASPE, and Sam Hunt.



References

Califano, J. Prepared testimony before the House Committee on Education
    and Labor. In Hearings on reauthorization of ESEA, Part 3, 1977.

Impact Aid two years later: An assessment of the program as modified by
    the 1974 Education Amendments.

Case Study on the Use of Evaluations:
Problems in Title I Testing

Case Study on the Use of Evaluation:
Federal Programs Supporting Educational Change



In light of the growing number of federal programs designed to produce
changes in education through local grants and projects, the Office of
Evaluation and Dissemination of OE awarded a contract to RAND Corporation
in 1973 to examine the adoption of innovations in school districts. Paul
Berman and Milbrey McLaughlin, the principal investigators, examined local
adoptions of four "change agent" programs: ESEA Title IV-C (at that time,
Title III), Bilingual Education, Vocational Education Exemplary Projects,
and Right to Read. RAND surveyed school district personnel, from
superintendents to teachers, in 293 districts, and studied 29 others in the
field. The investigators interviewed state and federal officials involved
in the change agent programs.

Berman and McLaughlin believe that although some school districts
initiated projects to try to solve problems, others initiated projects
primarily to obtain federal funds. Projects that were started in order to
get federal money were not successfully implemented, because they did not
have the commitment of local participants. Some of the problem-solving
projects also fell apart because they did not choose a flexible strategy
that accommodated the existing district organization. Four major factors
influenced the continuation of the projects after federal money ended.
These were: centrality of project goals to those of the district; demands
placed on teachers to change (if light, more change); complexity of
implementation; and consonance between the project philosophy and that of
the district.

The findings of the RAND study influenced the reauthorization of ESEA
Title IV-C in 1978. Citing the RAND study, both the House and Senate
reports noted that

An evaluation by the Rand Corporation of Federal funds for innovation
found that these funds had a major effect in stimulating local districts
to undertake projects that were generally consistent with
categorical guidelines. However, factors at the local level resulted
in successful implementation of only some of these projects and long-run
continuation of even fewer. (House report, p. 61, Senate report,
p. 50)

At a later point, the House report noted that a perceived need is for
more innovative approaches to compensatory education. Former Secretary
Califano had advocated allocation of funds for this purpose. Again citing
Rand's conclusion that providing federal funds can catalyze local
commitment to such projects, the House proposed an amendment such that half
of additional appropriated funds in Title IV-C would be used for innovations
in compensatory education. This proposal subsequently became law (Title
IV-C, section 431).

In noting the frequent lack of local commitment to continuing projects
after termination of federal money, the House report notes that the [...]
explicit, so that other districts [...]. Citing the Rand report finding that
few districts prepared themselves for termination of federal funding, the
House report set forth an amendment specifying that funds for projects
[...] a maximum of five years. Beginning in the third year, the level of
federal funding would start to decline. This amendment subsequently became
law (Title IV-C, section 432).

[...] and testified before the House Subcommittee on Elementary [...],
presenting his evidence. In addition, several officials in HEW were
instrumental in incorporating the RAND findings into [...]. [...], currently
an associate director of NIE, [...] policy analysis for the reauthorization.
He reviewed the evaluations sponsored by OE to find some common threads for
changes that might be made in the structure of programs. Of all the
evaluations he reviewed, the RAND study was most valuable to give a sense of
[...] delivery. Although the legislative proposal he helped develop was not
[...] submitted, the final proposal did retain ideas taken from the RAND
report.

Brenda Turnbull was assistant to Marshall Smith, then Associate
Commissioner for Policy Studies, during the period of the reauthorization
[...]. She mentioned that her office was disturbed at the RAND finding
that innovative projects were isolated from the rest of the school. They
therefore inserted language in the Administration's proposal that local
[...] money on the project themselves over time, while the federal share
would decrease. Marshall Smith's office [...] be integrated into the rest
of the school.

Smith's office was also impressed by evidence from the RAND report
and from an evaluation of compensatory reading programs, that the school
building is the level at which improvement takes place, more than the
district [...]. They therefore inserted language in the Administration
proposal that Title I projects as well as Title IV-C could involve
innovations in compensatory education, implementing the building-wide
approach.

The RAND report has also been used in the Office of Education in the
management of the National Diffusion Network. Specifically, the report
gave managers of this network information on areas of assistance they
could provide to local implementors, in particular the importance of
adapting innovations to local circumstances.

The report has also been useful to states. The director of Title IV-C
in the state of Kansas said that the RAND report had served as the basis
for changes in the program. For example, Kansas did not allow funding
of projects beyond 3 years, except for the dissemination of project ideas
to other districts. The RAND report convinced the director that a fourth
year of funding for the school district that developed the original project
was in order, so that the innovation could become institutionalized in the
district. In general, the Rand report confirmed the director's own
information that little steps and gradual changes are more likely to be
accepted by school districts. This is the advice that the director provides
to local implementors.

Legislators in the state of California incorporated information from
the RAND report in formulating two bills which were passed in 1977. The
first, AB 65, required LEAs to adopt minimum standards of proficiency for
students to receive degrees. However, these standards were to be developed
locally, consistent with a state assessment framework. The rationale for
local control, discussed by Assemblyman Gary Hart, was to be found in
the RAND study: local development of projects leads to better
implementation.

Bill AB 551 was also passed in California in 1977. It has become known
as the School Improvement Act. It provided for staff development in LEAs.
LEAs were to submit locally developed plans for development, and a waiver
provision explicitly noted that requirements of the Bill should not stand
in the way of needed and beneficial innovation. Again, Assemblyman Hart's
rationale for the Bill referred to the importance of local development
and explicitly to a paper derived from the RAND report by Milbrey McLaughlin
discussing successful and unsuccessful staff development programs.

One specific finding of the RAND report has received a great deal
of publicity because it has fit the spirit of the times. The finding
that local development of projects was superior to advice of outside
experts was extensively cited by newspapers, for example. NIE's Lois-ellin
Datta cites commentary and editorials on the RAND finding from Science
News, The Daily Oklahoman, and the Rapid City (South Dakota) Journal.

However, Datta criticizes the specific finding that locally developed
projects are more likely to be implemented, on the basis of the RAND data
itself. A follow-up survey indicates that no experts, inside or out of
the LEA, are particularly influential, but outside experts are perceived
by teachers as being more useful among those projects that were successful
in being adopted. Datta therefore believes that this particular conclusion
of the RAND study should no longer be cited. However, this does not
invalidate other findings of the study. No major secondary analysis has been
undertaken.

The following people were interviewed in the development of this case
study: Linda Bond, former legislative assistant to Assemblyman Gary Hart
of California; Milbrey McLaughlin, co-principal investigator of the RAND
report on "Federal Programs Supporting Educational Change"; Phillip Thomas,
Director of Title IV-C programs, Kansas State Department of Education; Mark
Tucker, Assistant Director of NIE; Brenda Turnbull, former assistant to
Marshall Smith, Executive Assistant to the Secretary of Education.

Case Study of the Use of Evaluation:
Follow Through

The Follow Through program was developed in part to sustain the
effects of Head Start and other preschool programs on performance and other
characteristics of children from low income families. Early appropriations
were clearly not sufficient to meet a general objective of serving all
relevant students. Consequently, the U.S. Office of Education shifted the
Follow Through emphasis to assessing promising approaches to compensatory
education and away from general service delivery which could not be met.
Local education agencies involved in the program were then encouraged to
[...] several models or approaches to educating participating children
in the interest of better testing the effectiveness of the models. The
models, developed by educational organizations and agencies and colleges,
differed in emphasis, some stressing emotional and social development, others
stressing direct instruction, still others the role of parents.

In 1977, Abt Associates produced a report evaluating the Follow Through
Planned Variations Experiment. The principal conclusions were: There were
greater variations between sites using a single model than among the models
themselves. Follow Through and non-Follow Through children performed about
the same, and in some cases Follow Through children's performance was higher
or lower. Follow Through children were still scoring below grade level after
several years in the program. Particular models, specialized approaches to
education, were contrasted for their overall performance and there is some
evidence that one such model had good results.

Judging from our interviews and documentation, the evaluation has had
little effect on the Follow Through Program itself. The current director
has not used the evaluation. Although the Office of Education has attempted
to reduce funds allocated to the program, this initiative has not been based
heavily on the evaluation. Rather, OE's rationale is that the program was
experimental in nature, and the experiment is completed. Although the
program does face some budget cuts, it is not clear how these will be
distributed among the Follow Through developers.

The evaluation has been discussed in public forums. Congressional
hearings on Follow Through were conducted by the House Education and Labor
Committee and by the Senate Human Resources Committee. A House staff member
requested information about the findings from the study's project monitor at
OED. The Senate heard primarily from the local directors of the Follow
Through project models, and the committee's hearings then reflect only
criticisms of the Abt Associates report (references below).

The evaluation was highly publicized. Several newspaper articles,
cited below, emphasized a secondary, highly controversial finding that Follow
Through models stressing basic skills performed, on the average, better than
others, in spite of great variations among sites. This may have influenced,
strengthened, or justified the "back to basics" movement in education among
readers of these articles.

[...] factors appear to have impeded the clear policy use of the [...]
Follow Through program. Originally intended as a service [...] led the
Office of Education to regard it formally as an experimental program
instead. The [...] program design deteriorated in the face of management
problems and pressure [...] Through as a service delivery program.

[...] has been subject to much debate from sponsors of the Follow Through
models and from evaluation specialists. A critique [...] data analysis
[...] confirmed some earlier criticism of the Follow Through evaluation
[...]. Although sponsors of the various [...] measured in the evaluation,
critics [...] measuring the sponsors' objectives, some of which were so
various [...] evaluator would be hard pressed to measure [...]. In addition,
some staff members of Interdependent Learning [...] that the evaluators
wanted [...] evaluation was biased [...].

Finally, Follow Through has had a vocal constituency pressing for
continuance of the program. When cuts were threatened in program [...], for
instance, Follow Through parents and teachers wrote in to their [...]
supporting the program. They clearly value the service delivery side of the
[...] legislative willingness to [...].

[...] Confusion about the nature of the program exacerbated problems about
[...] evaluation and acceptance of its findings. The evaluation has [...]
proponents in the "basics are better" movement. It has also contributed to
academic research, such as Cronbach's [...] the proper approaches to
educational evaluations. And it has contributed [...], if we may judge by
the GAO's description of lessons [...] evaluation design and administrative
procedures [...] discovered no clear linkage between results of the
evaluation and major modification of the program apart from unsuccessful
administrative efforts to reduce the program's funding level.

[...] John Evans, Director [...] Institute and author of a technical
history of Follow Through [...] Stivers, an administrator with the [...]
St. Pierre of Abt Associates and one of the investigators [...] Rosemary
Wilson, Director of Follow Through [...].

Case Study:
Use of an Exploratory Evaluation of the Follow Through Program

The Office of the Assistant Secretary for Planning and Evaluation
conducted an exploratory evaluation of the Follow Through Program in
1979. The evaluation found that there were conflicting views of the
mission of the program that needed resolution. There were also
disagreements about the management objectives of the program. Follow
Through Central Office had no procedures for either producing effective
services or for evaluating the effectiveness of those services. Finally,
there was consensus in OE that Follow Through was unsatisfactory as it
was functioning. Serious problems in the analysis were reiterated, however.

The Assistant Secretary for Education issued directives in 1979
that reorganized Follow Through at the national level and gave a new
role to sponsors of projects, who were to concentrate more on local
service and on knowledge-producing activities.

In addition, Follow Through was to develop performance indicators
to monitor projects. A contract for this purpose has been awarded to
Applied Management Systems. The development of these indicators was
a major recommendation of the exploratory evaluation.

Another recommendation was that the research function should be
returned to Follow Through itself, instead of OED, and should be staffed
with qualified people. The Follow Through Office is waiting to implement
this recommendation, which was authorized by Assistant Secretary Berry,
but until organization of the new Department of Education is complete,
this new staffing must wait.

The exploratory evaluation suggested three areas for further
research: extending the Follow Through program through the 6th grade,
varying levels of funding for Follow Through projects, and examining
the effectiveness of self-sponsored projects. A contract has been
let through the Follow Through Office to Boone-Young and Associates
to carry out these studies.

To generate this case study we interviewed Rosemary Wilson, Director 
of the Follow Through Program, and supplemented this interview with 
existing written information. 



References 

Assistant Secretary for Planning and Evaluation. Final Report on Follow
    Through. Washington, D.C.: Department of Health, Education and
    Welfare, Office of the Assistant Secretary for Planning and
    Evaluation, Office of Evaluation and Technical Analysis, 1979, EDC 1012.

Assistant Secretary for Planning and Evaluation. Report on evaluation
    utilization in the Department of Health, Education and Welfare.
    Washington, D.C.: Office of the Assistant Secretary for Planning
    and Evaluation, DHEW, 1980.






Case Study on the Use of Evaluations:
The Providence, R.I. School District



Dr. Ron Visco has been the Director of Title I evaluation in Providence
for close to three years. He was able to document a variety of instances in
which the school district adopted recommendations he made on the basis of
evaluations. Providence turns out an interim and a final evaluation report
each year, documents the recommendations that were made, and the action that
was taken on the recommendations. This also occurs for reports on the Title I
program to the state.

Instruction

In 1977 and 1978, Visco investigated the effects of the cut-off criteria 
for Title I on reading and math and on services to children under Title I. 
There were two levels for selection criteria. If children scored between one 
half standard deviation and one standard deviation below the local mean on 
reading tests, they were given 60 minutes per week instruction time in reading 
under Title I. If they scored below one standard deviation under the local 
mean, they were given 150 minutes of instruction per week. For math, these 
criteria were similar, except that if a child were receiving 150 minutes in 
reading, he or she could receive no more than 60 minutes in math. 
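
The selection rule just described can be written out as a short Python
sketch. It assumes scores are compared with the local mean in standard
deviation units. The function names, the treatment of scores that fall
exactly on a boundary, and the use of identical cut-offs for math (the text
says only that the math criteria were "similar") are assumptions made for
illustration.

```python
def reading_minutes(score, local_mean, local_sd):
    """Weekly Title I reading minutes under the two-level cut-off rule."""
    z = (score - local_mean) / local_sd
    if z < -1.0:
        # More than one standard deviation below the local mean.
        return 150
    if z <= -0.5:
        # Between one half and one standard deviation below the mean.
        return 60
    # Above the cut-off: not selected for Title I reading.
    return 0


def math_minutes(score, local_mean, local_sd, reading_minutes_assigned):
    """Weekly Title I math minutes, capped when reading is at 150."""
    minutes = reading_minutes(score, local_mean, local_sd)  # assumed same rule
    if reading_minutes_assigned == 150:
        # A child receiving 150 minutes in reading gets at most 60 in math.
        minutes = min(minutes, 60)
    return minutes


# Hypothetical child: reading score 35, math score 30, on tests with a
# local mean of 50 and standard deviation of 10.
child_reading = reading_minutes(35, 50, 10)
child_math = math_minutes(30, 50, 10, child_reading)
```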

Dr. Visco found that for both reading and math, 60 minutes of instructional
time had, at best, no effect, and possibly even detrimental effects. On the
other hand, 150 minutes appeared to have a positive effect although there were
some problems in interpretation. Dr. Visco therefore recommended that all
children would receive 150 minutes of instruction, and that in math, the 60
minute service category not be considered again. The Providence Title I
program has instituted this change.

In middle schools, the reading instruction involved two 45-minute periods
per week. The results were poor. Visco recommended in 1978-79 that the middle
schools increase instructional time. This recommendation was based on
interviews and on data. Providence adopted this recommendation in 1979-80,
increasing the number of periods to 3.

In 1978-79, a major debate concerned whether Title I classes should pull
children out of their regular class to a resource room, or whether the Title I
teacher should use the regular classroom, but teach Title I students in a
corner. In 1978-79, a decision was made by Title I administration that almost
all Title I instruction would occur in the regular classroom. This created
problems. First, almost all schools have excellent facilities in their
resource rooms, but these materials are not brought by teachers into the
regular classroom, for a variety of reasons. Second, conducting two classes in
one room can be extremely noisy, or uncomfortable for the Title I students.
Visco administered questionnaires to Title I and regular teachers, and
interviewed the Title I teachers. At the end of the questionnaire, he asked
what would the teacher suggest about where Title I instruction should be
performed. Of 131 questionnaires, only 5% responded that the regular classroom
was preferable. 51% believed that the resource room was preferable. Title I
teachers overwhelmingly preferred the resource room, or at least having a
choice. Principals had more mixed opinions. Visco talked to 15 Title I
students and each preferred the resource room.

[...] Title I staff opposed distribution of the questionnaire, teachers
[...]. In general the issue was very hot. Visco recommended [...] that more
flexibility be allowed; that [...] on physical structure of the school, size
of classrooms, size of regular and Title I classes, etc. Moreover, [...] he
recommended that the Title I teacher be allowed to [...] discretion. The
issue is still not [...] completely. However, Title I teachers are given
more flexibility.

[...] determined that they were frequently [...] students, which could
potentially impair [...]. [...] no more than 7 students per teacher [...],
and that instead of being forced to take more students, teachers agree to
having more students. Providence has adopted this proposal.

[...] teachers wanted more workshops on reading and [...]. [...] workshops
occur regularly and no fewer than 4 per year. Providence adopted this
recommendation.

[...] Central high school were almost all adopted [...] number of
unexcused [...] Title I classes. This had been administratively [...].
Recommendations included entry and exit criteria for the [...] performed
evaluations with all three models. They therefore changed [...] objectives
when Visco pointed this out to them. The Early [...] that this was not
sensible in light of [...] that indicated that only one-third of non-Title I
children in the norming group were able to pass any single subtest.

Classification and Achievement Testing 

The evaluator discovered that clerical errors were preventing eligible
children from receiving Title I services. He recommended that a computer
search for children [...] on their scores be instituted, rather than relying
on [...]. The program adopted this recommendation, [...] computer printout
with rankings of the children [...] alphabetically. When Visco examined the
[...] 1978, after [...] procedure was instituted, he found no clerical
errors in math, and only 4 in reading, out of a sample of 450. Only 1 of
these [...] Title I services and this was remedied. In 1979, there were no
errors at all.






Providence's Title I program has a challenge system, whereby a teacher
can challenge a student's scores on the CTBS, the test they use for the
criteria for Title I, if the teacher believes that the child is actually
performing worse than his score would indicate. Visco examined major
problems in misclassification of children in grades 1 and kindergarten. The
reason for the misclassification was that the test was given during the
first week of school, when it would be least indicative of the student's
achievements. The test was also administered improperly, according to Visco,
quite frequently. Visco therefore recommended that the challenge practice be
extended to kindergarten and first grade, where they had not been previously.
He also recommended that the challenge period be extended, in general, because
he felt that it was too brief and also occurred too early in the school year
for many teachers to be able to respond. Providence adopted this
recommendation. In 1978-1979, they extended the deadline of the challenge
period. In 1979-1980, challenges were permissible all year.

Visco felt that one of the biggest problems was the selection instrument
for kindergarten children, called SEARCH. Norms for the test had been based
on children at the end of kindergarten and beginning of first grade. However,
Providence used the instrument at the beginning of kindergarten. This was
inappropriate for several reasons, e.g., of the children receiving Title I,
98.6% got a score of 0 on one subtest. Moreover, the cutoff scores in the
test had norms for children of, for example, 70 months, but these were children
at the end of kindergarten, not the beginning. Visco recommended that the
school district eliminate SEARCH, or failing that, get rid of the current
cut-off scores in the test, and obtain local norms for determining the cut-off
scores. Providence did this.



In several grades, the Title I cut-off scores were below the level that
one would obtain by chance, merely guessing the answers. Part of the problem
was that cut-off scores were developed in the district as a whole, but were
used in Title I schools only. Visco therefore offered options to solve these
problems. However, this year, almost all the Providence schools now receive
some Title I funds, making the contribution of non-Title I schools trivial. In
this case, his recommendation was irrelevant, although the guessing problem
remains.
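
The chance level referred to here is straightforward to compute for a
multiple-choice test: a child guessing at random can expect one correct
answer for every k items, where k is the number of answer options per item.
A hedged sketch in Python follows. The item count, number of options, and
cut-off score are hypothetical, since the report does not give the actual
figures for the Providence tests.

```python
def chance_score(n_items, n_options):
    """Expected number correct when every item is answered by guessing."""
    return n_items / n_options


def cutoff_below_chance(cutoff, n_items, n_options):
    """True when a cut-off score is lower than the expected guessing score."""
    return cutoff < chance_score(n_items, n_options)


# Hypothetical subtest: 40 items with 4 answer options each, so random
# guessing yields 10 correct on average.
expected_by_guessing = chance_score(40, 4)

# A hypothetical cut-off of 8 would sit below that chance level, so a
# score at the cut-off carries little information about achievement.
problem = cutoff_below_chance(8, 40, 4)
```

A cut-off flagged this way cannot separate children who know some of the
material from children who are simply guessing.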



One of Providence's programs is English as a Second Language. The test
they used to determine children's eligibility for the program was the Tests
of Proficiency in English. This test was inappropriate, because it was
developed in Wales and normalized on East Asian children. The language tapes
relied on people with British accents and the picture materials referred to
British objects and customs that Providence Title I children could not
understand. His recommendation was to discontinue this test, and Providence
complied.

Some of his recommendations in 1978-79 became irrelevant. For example,
he made some recommendations dealing with the Metropolitan Readiness Test for
first graders, but as the Title I program for this grade was cancelled the
recommendation was no longer relevant.

[...] influenced his success in getting his recommendations implemented.
One is the possibility that the [...] recommendations and Title I responses
[...] changes. Also, he has learned to speak to the [...] the
recommendations in writing [...] regards merely "suggesting" changes as
insufficient; they won't pay attention unless a change is a formal
recommendation.

Case Study on the Use of Evaluations:
Parent Advisory Committees

Title I legislation provides that Parent Advisory Committees (PACs)
shall be given "responsibility for advising [the LEA] in planning for,
and implementation and evaluation of, its programs and projects" under
Title I. However, the law does not specify how parents are to advise
local education agencies. Our site visits and independent work confirm
that there is a wide variation in the extent of parent involvement with
and use of evaluations. In some districts, the Title I staff or evaluator
merely provides parents with reports. In other districts, PACs are more
actively involved in the evaluation itself, reports are interpreted to
them, and so on.

We interviewed several people with national advisory roles bearing
on PACs. These interviews gave us a perspective on the variation in
parental involvement. For example, PACs in South Carolina, Georgia,
Alabama, and Mississippi have not used evaluations, to the best knowledge
of Hayes Mizell, director of a group offering technical assistance to
PACs. However, access to reports is a major problem in these states.
Moreover, the evaluations frequently contain technical jargon that obfuscates
the content. School administrators challenge negative findings about programs
on the basis of the methodology or measurement used. Parents do not have the
training to judge for themselves the quality of the evaluation.

On the other hand, the Washington, D.C. * district-level PACs are ex- 
tensively Jjivolved in evaluation, according to Tom Heatley* His organization 
has provided technical assistance to PAC's in 40 Washington schools* 

Finally, there are examples of extensive use of evaluations by PACs.
The Providence, Rhode Island, PAC uses evaluation frequently, according to
Constance Gomes, former chairperson of the PAC. In 1979, for example, the
PAC noted the finding from evaluation that reading scores in middle school
were poor compared to those in elementary school. The short time spent on
reading in middle school was one probable cause. The PAC pushed for adoption
of the recommendation that time in reading be increased. The LEA adopted the
recommendation.

In the same year, the Providence PAC strongly recommended that children
remain in the program on English as a Second Language for longer periods of
time. This position was based on the evaluation finding that children were
not reaching their objectives by the time they exited the program, but that
children who remained in the program longer were more successful.

In 1973 evaluation revealed that the reading program was ineffective
in Providence. The PAC advocated a new model for the program. The LEA
instituted a new model.
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actively Involved. Both Dr, Koff 
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according to Gomes was a major^to^ 
district . 
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involve the abstruse technlcal^sp^t 
ization gives parents a basic^tl 
TACs could serve this function " _ 

how the obiect^Jes lilt °l f f ""f "^^^^ ^^""i'^S objectives, 
3) Access to reports must be improved. Although the law now
stipulates such access, there are still problems in some LEAs.

4) Involve the parents in evaluation from the beginning. If they
participate in the choice of evaluation objectives, their questions are
more likely to be answered. Such participation involves reading the Title I
applications, understanding objectives and methods to reach educational
objectives, and the basics of evaluation.

In developing this case study we interviewed Constance Gomes, member,
National Advisory Council on the Education of Disadvantaged Children, and
Education Specialist, Rhode Island Legal Services; Tom Heatley, Executive
Director, National Coalition of ESEA Title I Parents; Dr. Robert Koff,
member, National Advisory Council on the Education of Disadvantaged Children,
and Dean, School of Education, SUNY Albany; and Hayes Mizell, Chairman,
National Advisory Council on the Education of Disadvantaged Children, and
Director, Southeast Education Program of the American Friends Service Committee.

Case Study on the Use of Evaluation:
The National Study of Vocational
Education Systems and Facilities

[illegible] study under the direction of Allen Woodruff provided the
first comprehensive information on vocational education [illegible].
[illegible], governance, and financing of state systems were studied.
[illegible] characteristics of 6,660 institutions were described,
including geographical distribution, facilities construction patterns,
[illegible] of shops and laboratories in each institution, [illegible]
accessibility of vocational education to the [illegible].

According to Woodruff, the study was a pioneer effort that revealed many
gaps in knowledge. It provided a data base for the uses listed [illegible].

[illegible] to interest groups, in that it has been cited [illegible]
President of the National Vocational [illegible]. He cited information
from the report in Oversight Hearings on [illegible] the Youth Employment
Act of [illegible] April 26, 1979 hearings before the House Appropriations
Committee.

[illegible] in meeting legislative demands for information in the
[illegible]. Congressionally mandated studies have made use of this
report [illegible]. Charles Benson has made use of the [illegible]
examining educational financing of Vocational Education. Laurie Harrison,
Director of the American Institutes for Research study of sex equity in
Vocational Education, has made use of Westat's sample also. The National
Center for Educational Statistics was able to delete from the Vocational
Education Data System (VEDS) the component involving data on facilities
in Vocational Education, because Westat had already collected [illegible].

The study appears to have been useful in federal management as well.
[illegible] to the Department of Labor in their efforts on behalf of the
[illegible] on Unemployment. He provided them with data in Volume 1 of
the report, dealing with the extent to which [illegible].

Dan Dunham, Assistant Commissioner of the Bureau of Occupational and Adult
Education, analyzed unpublished data from the survey to address questions
[illegible] capacity of [illegible] facilities, real levels of utilization,
and the [illegible] additional load in vocational education. Dunham has
also used information from the Westat report in Congressional testimony
[illegible] Bureau of Occupational and Adult Education [illegible]
information in speeches, especially the information on access to vocational
education facilities).

Leroy Cornelson, formerly Director of Compliance and Grants in the
[illegible] Education, has used information from the Westat report in
several ways. He developed the 5-year and 1-year budgets for Vocational
Education for the administration. He also used information in the report
relating to the governance of the administration of Vocational Education
in the states. For example, the report allowed Cornelson to assess the
effectiveness of the provision that a "sole state agency" retain control
over policy and decision making in Vocational Education. He used this
information in outlining issues for the reauthorization of Vocational
Education legislation.

The national study appears to have been of some interest at the state
and local level. Final reports from Westat were purchased by 38 state
agencies and 30-50 state advisory councils on Vocational Education. States
have used the data as a resource in changing their laws and policy. For
example, Georgia sent a representative from the joint legislative committee
to discuss the implications of the research with Woodruff. Kentucky did the
same. Woodruff has served as a consultant to a blue ribbon panel studying
the Vocational Education system of the state of Iowa. Maryland contracted
with him as a consultant on educational finance, to compare their resources
in Vocational Education to those of other states. While these state uses are
not documented, Woodruff is fairly sure that their thinking about their
Vocational Education systems changed as a result of considering the report.

In developing this case study we interviewed Dr. Charles Benson of
the School of Education at the University of California at Berkeley; Dr.
Robert Calvert, Branch Chief, Adult and Vocational Surveys and Studies, NCES;
Dr. Leroy Cornelson, Director of State Programs, Bureau of Occupational and
Adult Education; Dr. Dan Dunham, Assistant Commissioner, Bureau of Occupational
and Adult Education; Dr. Laurie Harrison, Director of the AIR study of sex equity
in Vocational Education; Dr. Edward Rattnar, Project Monitor of the Westat study;
and Dr. Allen Woodruff, Principal Investigator for the Westat study of Vocational
Education facilities.

Case Study: Use of an Evaluation of
Services to Neglected and Delinquent Youth under Title I

Systems Development Corporation was given a contract to evaluate
[illegible] were being served. Moreover, [illegible] instruction than did
non-Title I students [illegible].

[illegible] information was used in the Education Amendments of 1978
[illegible] demonstrate concern over problems [illegible] of participating
students felt the [illegible] classes in [illegible] and math were teaching
them more than other classes they had taken in these subjects." (p. 39)

[illegible] that Title I students would receive more instructional time,
an amendment was created to emphasize that in institutions, Title I funds
are to supplement, not supplant, money for education provided by the states.
[illegible] The findings of the report influenced this decision directly.
Moreover, the contractor and [illegible] with reliable information
[illegible] Title I [illegible] spent in instruction, and this [illegible]
requiring a minimum of 5 hours of instruction per week. However, the House
Report indicates that the committee felt this minimum requirement to be
[illegible]. This amendment became Section 152 of Public Law 95-561.

[illegible] administrators we interviewed noted that the intent of Title
I legislation had already been to supplement, not supplant, state funds.
However, [illegible] Neglected and Delinquent Program still found
violations [illegible]. The amendment had the effect of emphasizing this
requirement for state institutions in particular.

[illegible] Education and the Congress, discussion continues about the
means of improving services to students in the institutions. For example,
instruction of students incarcerated in adult institutions is [illegible].
[illegible] of instruction has been found to be critical for achievement,
and so there is much debate over the [illegible] in institutions. Senator
Pell [illegible] introduced a bill for education of students in corrections
institutions and, according to the program manager, Congressional staff
call the manager for information bearing on the bill.

The Office of Program Evaluation of the Department of Education has
recently issued a request for proposals for an evaluability assessment that
will cover nine substantive areas of education program management. One of
these, according to the SDC contractor, involves the Neglected and
Delinquent Program. The contractor believes that the RFP emerged as a direct
consequence of the SDC evaluation's having identified problems with the
program.

The program manager noted that state and local educators have taken
the results of the evaluation back to their agencies and institutions in
order to show these agencies that there are problems with existing
arrangements and to argue for improvements. We interviewed one of these
educators, the Director of Education of a school for delinquent youths. He
was one of the program consultants for the SDC evaluation. He brought
information about preliminary findings back to his school, and gradually the
teachers in the institution became sensitized to similar problems in their
own school.

So, for example, the curriculum committee made some major changes in
the courses offered by the school in the fall of 1979. The SDC report
had reinforced for them what the teachers had been saying about the school.
The changes introduced were some mini-courses in three major areas. Some
of these dealt with survival skills that the students would need on their
release, such as information about family and peer relations. Others dealt
with life skills, such as functional writing and consumer information. The
third major area consisted of an orientation to industrial arts.

The Director of Education argues that the SDC report had enabled the
school to argue cogently before the state legislators for the changes they
had made. They were able to go to the central corrections office and speed
up changes that might have taken place anyway, but would have taken at
least a year longer. Because they had data, in addition to their own
experience with the school, they were able to refute those who opposed their
changes.

Several of the people we interviewed noted that the study has been
useful in part because of the good relations among the contractor, the
project monitor, and the program manager. In fact, the program manager
welcomed the evaluation because it brought attention to problems which
therefore had a higher likelihood of being addressed. That they may be
addressed is evident from legislative debates over further action, and
dissemination of the results to state and local institutions.

In developing this case study we interviewed Janice Anderson, Project
Monitor for the evaluation of Neglected and Delinquent Youth Program,
Office of Program Evaluation; Theodore Bartell, Principal Investigator for
the Systems Development Corporation evaluation; Ellen Balko, Procurement
Officer for Office of Program Evaluation contracts; Chris Cross, former
Congressional staff member; John Hoyt, Investigator for the Systems
Development Corporation evaluation; Pat Mancini, Education Program Specialist
for the Neglected and Delinquent Program; Paul Miller, Program Support
Branch Chief, Division of Education for the Disadvantaged; and James
Wickman, Director of Education of the Lincoln Hills School, Irma, Wisconsin.

Case Study: Use of
the Sustaining Effects Study

The Education Amendments of 1974 required the Office of Education to
report on the numbers of children who were economically and/or educationally
disadvantaged who do and do not receive compensatory education services. The
Sustaining Effects Study, carried out by Systems Development Corporation with
Decima Research Corporation as a subcontractor, supplied this required
information. They sampled over five thousand schools, surveying each principal
in order to obtain school characteristics, percentage of poor readers, and
source of compensatory education funds. They interviewed the families of
a subsample of 13,000 children, to determine their economic status.

The comparison of economically and educationally disadvantaged children
was used extensively in the debate between Congressman Perkins and former
Congressman Quie over the criterion for inclusion of children in Title I.
Both Congressmen could use data from the Sustaining Effects Study to support
their positions. For example, the Study showed that 39% of low income/low
achieving students were being served by Title I, so that Congressman Perkins
could argue that the focus of Title I should continue to be the poor. On
the other hand, Congressman Quie could argue that because only 40% of low
achieving, non-poor students were receiving any type of compensatory
education services, the program should be expanded to include the non-poor.
Quie could argue that Title I money was going to relatively few attendance
areas. Perkins could argue that there was a relationship between the number
of poor children in a school and the number of low achievers, and that
Title I funds made a greater relative contribution to poor districts. Data
from the study were cited supporting Quie's view in the House report (p. 20).

Chris Cross, a former staff member for Congressman Quie, notes that
the utility of the Sustaining Effects Study was mixed from Quie's point of
view, in that it did supply ammunition to both sides. Administrators of
Title I that we interviewed noted that at least the data informed the
debate. Moreover, the Sustaining Effects Study and the NIE Compensatory
Education Study at least allowed the Office of Education to cite facts to the
Committee, rather than their feelings or judgments. These administrators
noted that the debate over the scope of Title I will probably continue at one
level or another for quite a while.

Several respondents noted that during the course of hearings in 1977
and 1978, Congressional staff requested informally that about 10 special
analyses be performed. These requests were channeled through the
Congressional Research Service to the project monitor at the Office of
Evaluation and Dissemination, who would then request that Systems Development
Corporation perform the analyses. These special analyses consisted of
projections of the consequences of changing the formula allocations for
school districts. It is not clear that these analyses influenced the eventual
allocation formula. People have forgotten this information over the two years
since the passage of the amendments.

The findings on the numbers of children receiving services did, however,
cause a number of activities within Title I management. Some 6% of nonpoor,
non-low achieving students received Title I services, according to one
analysis, while 48% of low income, low achieving students were receiving no
compensatory education services from any source. The causes of these
disparities are often beyond the control of the federal program. However,
according to one of our respondents, this information has provided the impetus
and the resources for Title I to devote more attention to student selection.
New technical assistance is being offered to districts to enable them to
better select students. This assistance is being offered by the Title I
Technical Assistance Centers, as well as central office personnel.

Since the Amendments of 1978, the Sustaining Effects Study has been
used in budget allocations in two ways. First, according to two of our
respondents, the data helped to justify budgets for Title I for fiscal years
1978 through 1981. The data were also used in Administration responses to
questions posed by Congressmen during the appropriations hearings. Secondly,
the data were used to check on the accuracy of the Title I administrators'
estimates of the number of children served by the program. The last data
that had addressed this question had been collected in the late 1960's.
From the Sustaining Effects Study, program managers were able to show that
their estimates were correct.

One program analyst said that she had used the Sustaining Effects Study
when she was working on a study of overlap between Title I and handicapped
services under PL 94-142. The project monitor confirms this. Because there
seemed to be little overlap (a finding confirmed by the GAO), plans to
deal with overlap have been cancelled. In the absence of this information,
however, needless activity might have ensued.

An analyst for the Assistant Secretary for Planning and Budget said
that he was using data from the Sustaining Effects Study to help the Office
of Civil Rights formulate regulations for bilingual education. The
Sustaining Effects Study is the only good source of information on the
numbers of students involved, and therefore the cost of serving them.

This same analyst said that in the near future he would use Sustaining
Effects Study data to examine the adequacy of Title I regulations for targeting
services to eligible students, for examining the relative effects of home
and school on achievement, and for understanding the effects of TV on
achievement.

Some respondents said that the Sustaining Effects Study data were
difficult to use or to understand. The tables were cryptic, the rows and
columns did not necessarily add up, and outside assistance therefore became
necessary. One budget analyst requested assistance from the project monitor,
who clarified this information for him. The analyst examining the overlap
question sought assistance but was not satisfied even so. This situation
may be remedied in the final report, which is to be completed shortly.

The analyst for the Assistant Secretary for Planning and Budget said
that because the Sustaining Effects Study's budget was cut by Congress, data
were not collected that would have been important for questions he was
addressing. The Systems Development Corporation and the Office of Evaluation
and Dissemination made the decision that they ought to retain those elements
of the sample that were of interest to compensatory education. Because of
this decision and because of attrition, the sample was no longer a
representative one at the end of 3 years. For the purposes of this analyst,
the losses of information were considerable. For the purposes of studying
compensatory education, the decision was unfortunate, but it might be argued
that it was necessary.

Chris Cross has said that Congress was concerned that at the end
of the Sustaining Effects Study, the Title I program would be greatly changed
and the relevance of the data therefore limited. He points out that the
program has indeed changed a great deal. The Office of Program Evaluation
has argued that in reality compensatory education programs do not change
very much, in spite of changes in policy. The effect of Congress' action,
however, was to greatly reduce the generalizability and detail of the
information the study could provide.

In developing this case study we interviewed Janice Anderson, the
current Project Monitor for the Sustaining Effects Study, Office of Program
Evaluation; Keith Baker, Social Science Analyst for the Assistant Secretary
for Planning and Budget; Beatrice Berman, Program Analyst for the Assistant
Secretary for Planning and Budget; Vincent Breglio, Executive Vice President,
Decima Research Corporation; Launor Carter, Vice President, Systems
Development Corporation; Chris Cross, former staff member to Congressman
Albert Quie; James Hubbard, Program Analyst for the Assistant Secretary for
Planning and Budget; William Lobosco, Education Specialist for Title I; Thomas
McNamara of the Office of the Assistant Secretary for Planning and Budget;
Paul Miller, Program Support Branch Chief, Division of Education for the
Disadvantaged; and George Mayeske, former Project Monitor for the Sustaining
Effects Study, Office of Evaluation and Dissemination.

Case Study: Use of an Evaluation
of ESAA Nonprofit Organizations

The Emergency School Assistance Act of 1972 supports nonprofit
organizations in communities to assist in implementing school desegregation
plans. The Rand Corporation studied the effectiveness of these nonprofit
organizations and contrasted them with community organizations not funded by
ESAA. Rand identified several problems relating to their effectiveness that
have led to alterations in the program. Jesse Jordan, Director of Program
Operations for the Equal Educational Opportunity Program, notes that, "As a
result of this evaluation, the program was changed to the extent that it
is really a different program now."

Rand found that the activities of nonprofit organizations were not well
coordinated with desegregation activities of the districts. Rand also
identified characteristics of effective, as opposed to ineffective, nonprofit
organizations, including the use of citizen action strategies such as
formation of coalitions with community groups to promote desegregation. OE
was funding relatively few such nonprofit organizations and had no means,
before the report, of assessing what made for an effective nonprofit
organization.

According to Jordan, these findings contributed to Congress' decision
to remove the funding of such organizations from the state apportionment
[illegible]. The language of Section 608 was changed, in the Education
Amendments of 1978, to fund nonprofit organizations through national
competitive grants instead. Although the evaluation is not cited in the
House report for this particular change, it is cited shortly after in
another context. The change to competitive national grants allows OE to fund
those nonprofit organizations showing promise of being effective.

New regulations dealing with ESAA nonprofit organizations reflect
Rand's assessment of what factors produce an effective organization. All
those people interviewed for this case study agree on this. Moreover, the
Rand study is cited in response to commentary on the proposed regulations
in the Federal Register for April 11, 1980:

One commenter asked why experience with other community organizations
was considered a relevant criterion [for funding applications]... The
Rand study of the NPO program indicated that organizations that achieved
the greatest impact in promoting desegregation were those that utilized
citizen action strategies such as informing the public on desegregation
issues or forming coalitions with other community organizations. This
finding was taken into consideration in developing the criterion. (p. 25029)

Criteria for applications that reflect the Rand findings include: involvement
of community members in the project; sensitivity to the community and
population to be served; and "experience in working effectively with community
organizations, especially on matters related to school desegregation and
race relations" (Sections 185.129 and 185.130).

The Rand study also found that the close relationship between districts
and nonprofit organizations often resulted in the nonprofit organizations'
providing compensatory education services. In the House report for the
Education Amendments of 1978, this report is cited as showing that
"two-thirds of ESAA funds are spent on instruction in basic skills." (p. 96)
The Committee decided that the emphasis of the nonprofit organizations should
be desegregation, not instruction, and therefore revised the list of
activities to be funded to focus directly on desegregation. The language of
Section 608 of PL 95-561 was changed to this effect.

The Rand study was also used as justification for subsequent changes
in regulation dealing with the prohibition of compensatory education using
organization funds. In response to comments on the proposed regulations,
it is noted:

An evaluation of the NPO program by the Rand Corporation revealed
that these activities undermine the effectiveness of the NPO in
facilitating school desegregation. Moreover, the legislative
history of amendments to the Act made by Pub. L. 95-561 calls into
question the emphasis given to these activities in the past and
indicates a Congressional preference for activities more closely
related to the desegregation process (Federal Register, April 11,
1980, p. 25028).

In 1979, Congress cut the budget for nonprofit organizations by
two-thirds, an action not sought by the Administration. All those interviewed
for this case study agreed that the justification for the cuts was the Rand
study. The Senate report on appropriations for FY 1980 cites "an HEW study"
reporting that the majority of organizations were not actively involved in
desegregation. The budget for the program was cut by $10 million, in order
that OE would fund only effective organizations.

In developing this case study we interviewed Lawrence Bussey, Special
Assistant to the Deputy Assistant Secretary for Equal Educational
Opportunity; Jesse Jordan, Director of Program Operations, Equal Educational
Opportunity Program; and Robert York, Project Monitor for the evaluation of
ESAA Nonprofit Organizations.

Case Study: Use of an Evaluation
of Magnet Schools



In 1979 Abt Associates completed an evaluation of OE's magnet schools
as a tool for desegregation. Magnet schools have specialized curricula and
other resources that are intended to attract students to the school and so
facilitate voluntary desegregation of the community. Abt Associates
interviewed administrators, teachers, parents, and community groups involved
in desegregation in 18 school districts containing magnet schools. They
assessed the conditions under which magnet schools are likely to be effective
in desegregation.

The evaluation noted that magnet schools are successful in achieving
desegregation within the school and are associated with desegregation in
the district as a whole. Moreover, it noted that the communities had more
positive attitudes toward desegregation after experience with magnet schools.
However, the report cautioned readers that the findings on district-wide
desegregation were qualified by several problems, and that the findings on
attitudes were "at best, suggestive" (p. 11).

Nevertheless, these positive findings were cited. Prior to the
evaluation, there was a general realization that magnet schools were an
untested concept. Because the evaluation was generally favorable, several of
the people interviewed for this case study believe that the report put an end
to debates over the efficacy of magnet schools. This belief is also
reflected in the House Report on the Education Amendments of 1978:

...an evaluation of the ESAA magnet schools by Abt Associates
concluded that in every site visited, people felt these schools
had a positive effect on community attitudes. (p. 93)

The Senate Committee on Appropriations also commented on the positive
findings with respect to magnet schools in its 1979 report:

The Committee has included $50,000,000 for magnet schools; this
is an increase of $25,000,000 over the 1979 appropriation. The
magnet school is one of the most effective tools for voluntary
desegregation. A recent evaluation shows that magnet schools
are an effective tool in helping to improve community attitudes
toward schools. (p. 107)

However, the Abt evaluation did indicate some problems with the program
and areas in which its effectiveness could be increased. The evaluators
discovered that OE was awarding magnet school funds to districts with poor
records of desegregation, and to districts that had little need for
desegregation assistance. Moreover, they discovered that magnet schools were
more effective as part of a comprehensive desegregation plan. According to
Monika Harrison, Special Assistant to the Deputy Assistant Secretary for
Equal Educational Opportunity, these findings confirmed what the
Administration had already suspected.

According to Jesse Jordan, Director of Program Operations of the Equal
Educational Opportunity Program, the Department of Education is forbidden
by law to require that magnet schools be part of an overall desegregation
plan. The Abt finding that magnet schools were more effective in such an
overall plan was brought to the attention of the House of Representatives
Committee on Education and Labor in hearings in March of 1980 (transcripts
were not available as of this writing). According to Jordan, Congress will
probably not change the law, because in spite of the Abt Associates finding,
it still hopes for results from the magnet school concept alone.

The Department of Education was able to revise regulations to take into
account the school district's desegregation record, however. Jesse Jordan
had primary responsibility for writing new regulations for the ESAA programs.
Jordan said that he used the Abt findings in changing the rankings of
applications for magnet school funds. The new regulations give priority to
those applications from districts that have achieved reductions in the
isolation [illegible]. The relevant regulation is section 185.104, published
in the Federal Register for May 16, 1980.

Funding for the magnet school concept has been rising rapidly, at least
in part because the program is popular with Congress. According to Monika
Harrison, [illegible] the program should be cut in light of the Abt finding
that the program was growing out of proportion to actual desegregation
activities. According to Harrison, who assisted in the preparation of
budgets, the Administration requested reduced funding for the program for
FY 1981 and a rescission in funding for FY 1980 (transcripts of hearings
not available as of this writing). Jordan agrees that the Abt findings
provided part of the Administration's rationale for the cuts, but it is not
clear whether the report was cited. Jordan notes that Congress did not
agree to the cuts in funding requested by the Administration. [illegible]
Congress did reduce the budget for magnet schools by $6 million, but
Jordan believes that this was part of an overall budget reduction and was
not intended to cut magnet school funding per se.

Abt Associates noted several reasons to change the regulation that no
more than 50 percent of magnet school students be minority students. The
requirement helped discriminate against districts with large minority
populations; it demeaned minorities, who already felt that the program
benefitted whites; and, more generally, the focus of the program was felt to
be the district, not the school. Both Jesse Jordan and David Lerch, Program
Manager for Magnet Schools, said that these findings were used in changing
the regulations. The new regulations allow more flexibility by focusing on
the district rather than on the school. The relevant section of the
regulations was published in the Federal Register for May 16, 1980.

In developing this case study, we interviewed Monika Harrison, Special
Assistant to the Deputy Assistant Secretary for Equal Educational Opportunity;
Jesse Jordan, Director of Program Operations, Equal Educational Opportunity
Program; David Lerch, Special Projects Branch Chief, Equal Educational
Opportunity Program; Eugene Royster, Principal Investigator of the Abt
Associates Magnet Schools Evaluation; and Robert York, Project Monitor
for the Magnet Schools Evaluation, Office of Program Evaluation, Department
of Education.
Case Study: Use of a Survey
of ESAA-TV Viewership



The Office of Education funded the development of public television
programs under the Emergency School Aid Act (ESAA). A survey of the
viewership of these programs was conducted by Applied Management Sciences,
covering children's viewing of the ESAA-TV series. Less than 10% of the
children preferred the ESAA programs over other programs being shown at the
same time. This is fairly typical of educational television, however:
compared to other such public television programs, ESAA-TV had achieved
reasonable viewership.

The program people we interviewed and the project monitor agreed
that managers were aware of problems with viewership before the survey was
completed. Some management changes were already under way at the time the
survey was completed, while others were later implemented. For example, a
subcontract had been let to increase viewership. A contract was let after
completion of the survey for converting the programs to school use, rather
than home use. The reason for the viewership problem was that the program had
concentrated on development of series, rather than promotion of them,
according to the program people. The project monitor, when asked why the
survey was conducted, said that it is difficult to determine exactly when
managers had realized there were problems. He said that the study was closely
coordinated with the managers.

Because changes in the program geared to increasing viewership are
under way, the data collected in the survey may become obsolete. The
principal investigator of a study of the program's administration, funding,
and local stations carrying the programs noted that the viewership situation
has changed radically because of these changes. Therefore, her group may
perform secondary analysis of the survey data to supplement their own report,
but recognize that it is dated.

Although the report was intended primarily for the managers of the
program, it was used by the Department of Education, according to Monika
Harrison, Special Assistant to the Deputy Assistant Secretary for Equal
Educational Opportunity, in budget requests. Harrison assisted in preparation
of the budget proposals for the program in 1979. Because the data showed the
necessity for promotion, Harrison believes that they allowed the
Administration to argue cogently for funds to promote the series in hearings
before Congress. Although Congressional hearings make no mention of the
survey, it is distinctly possible that this use occurred informally, within
the administration.

Jesse Jordan, Director of Program Operations of the Equal Educational
Opportunity Program, notes a use of the survey in regulations, although
again the information corresponded to what was already known about the
program. Jordan says that he used the survey results in writing new
regulations that changed the funding for producing the television series from
grants to contracts. This statement is corroborated by others. This
arrangement gives the Department of Education more control over what kinds
of programs will be produced, thus enhancing viewership. The relevant
sections of the new regulations are 185.150 through 185.155 as published
in the Federal Register for May 16, 1980. In response to commentary on
the proposed regulations, the following rationale is given:

These requirements are based on program experience in managing the
educational television program authorized by the statute as originally
enacted, and are necessary to implement the statutory mandate that "programs
... be made reasonably available for transmission, free of charge, and
shall not be transmitted under commercial sponsorship." (p. 32654)

In developing this case study we interviewed David Berkman, Assistant
Dean in Charge of Telecommunications, Newhouse School of Public Communication
and former Director of the ESAA-TV program; Malcolm Davis, Director, Division
of Educational Technology and director of the program; Monika Harrison,
Special Assistant to the Deputy Assistant Secretary for Equal Educational
Opportunity; Jesse Jordan, Director of Program Operations, Equal Educational
Opportunity Program; Arthur Kirschenbaum, Project Monitor for the ESAA-TV
survey, Office of Program Evaluation; Bernadette Nelson, Principal
Investigator for Part II of the ESAA-TV study, currently being conducted by
Abt Associates; and Anne Kuchak, Vice President, Applied Management Sciences.
Case Study: Use of an Evaluation
of Implementation of Project Information Packages

This study, conducted by American Institutes for Research, assessed
how innovative education projects are implemented at sites other than
those where the projects were developed. The evaluators found that the
dissemination strategy and the instructional packages created to facilitate
implementation were moderately useful. However, the newly implemented
projects often differed from the innovation as originally developed.

A briefing on the study challenged assumptions people had held about the
program. One assumption was that the whole program would be adopted, rather
than parts. Adopters tended to take from an innovation what appealed to them.
The briefing also challenged the idea that adopters would eagerly seek advice
from the developer. To obtain adoptions that did not differ notably from the
original, incentives must be created.

The packages developed to facilitate program implementation, the Project
Information Packages, originally represented a management experiment,
according to several of the people we interviewed. The research convinced
some managers that other actions were necessary. Others maintain that they
believed all along that other actions were necessary and that the evaluation
strengthened their belief.



Those we interviewed agreed that subsequent changes in the National
Diffusion Network, the organizational vehicle for making the innovations
available, were at least in part attributable to this study. First,
developers of the innovative projects were funded to give individual
assistance to project adopters. Second, OE contracted to provide developers
with technical assistance so that they could disseminate their projects and
communicate this information more effectively.

A further modification of the program reflects concern over the extent
to which adoptions of the projects are similar to the original. The Office
of Dissemination and Replication is currently attempting to determine
criteria for the quality of adoptions. New regulations published in the
Federal Register for April 21, 1980, require developers to present, every
four years, evaluation evidence of the quality of adoptions. The relevant
sections of the regulations include 193.12.

The managers in the Office of Dissemination and Replication noted
that in the early years of the National Diffusion Network, primary attention
had been given to increasing the acceptance of the program and gaining
as many adoptions as possible. The evaluations of the Project Information
Packages were part of a watershed for the Network. They are now trying to
change the presumptions of adopting sites and to persuade them to accept
the idea of evaluation as an integral part of the projects. A contract to
produce an evaluation guide to assess the quality of adoptions has been
made with the Center for the Study of Evaluation at UCLA and has been under
development for a year.

The CEIS contact person for the study of Project Information Packages
was contacted for this case study. In the opinion of this independent
observer, the study was well done technically, given the constraints of
field research. However, this individual noted that timeliness of the
report had been something of a problem. The negotiations between AIR and
OE over content and methodology were to some extent responsible. As a
reviewer, he wanted more data to support the descriptive conclusions of
the study. AIR followed his suggestions on redrafts of the report.

In addition, this individual noted that OE, in conducting the series
of Project Information Packages studies, had not made full use of existing
literature on the subject of adoption of innovations. For example, he
cited one suggestion of an evaluator, that implementation of a new project
might be broken down into project elements, rather than whole programs.
However, this idea had been current in the literature for many years. In
fairness to OE, however, he noted that one has to try such ideas to find
out if they will work.

In developing this case study we interviewed Will Ashmore, Program
Evaluation and Planning Consultant to the Wisconsin Department of Public
Instruction; Anne Bezdek of the Office of Program Evaluation; Judy Burnes
of the Office of Program Evaluation and project monitor for the evaluation;
Peggy Campeau, principal investigator for the AIR study; Andrew Lebby
of the Office of Dissemination and Replication; Louis Walker of the Office
of Dissemination and Replication; and Lee Wickline, Director of the Office
of Dissemination and Replication.
Case Study: Use of a
Search for Exemplary Career Education Projects 

The Office of Education contracted with American Institutes for Research
to conduct a search for effective projects in career education. A second
purpose was to identify ways of improving career education evaluations.
AIR asked knowledgeable individuals to nominate outstanding career education
projects. Ten of these passed AIR's criteria of evidence of effectiveness.
These projects were visited in order to verify the information they had
provided, and in order to collect additional data. AIR then prepared
documentation to present each project for submission to the Joint
Dissemination Review Panel (JDRP) of the Office of Education and NIE.

The JDRP reviewed all ten projects and judged seven as having proven
their effectiveness. These seven projects thus became eligible for grants
as developer/demonstrators in the National Diffusion Network. Six of the
projects did apply for such grants.

Evidence of use of the projects consists of the numbers of adoptions
of the projects by other sites, and the number of inquiries received by the
developer/demonstrators. We were able to contact four of these projects for
information. The remaining two were closed for the summer. We also asked
the projects about the role of technical assistance in helping them
pass the JDRP.

Project CAP of Greenland, Arkansas provides career awareness information
as part of the regular curriculum. It was approved for dissemination
through grade 8. They have been funded for dissemination since October
of 1979, nine months as of this writing. Thirty-five sites in 8 states
have adopted their project. They receive an average of 5 or 6 inquiries a
week. If they had not been passed by the JDRP, 5 sites would nevertheless
have adopted their project. The director of the project said of AIR that,
should they develop new projects, "we will ask them to help us evaluate
them." She mentioned that there were already plans afoot to submit the
project, but without AIR's assistance, "it would have died in the water." For
this project, therefore, 30 adoptions in 9 months can be attributed to AIR.

Career Development Programs, of Akron, Ohio, also uses career education
activities as part of the ongoing curriculum. The program is approved
for grades K through 10. Two years after approval, they have
10 agreements to adopt the project that are in various stages of
implementation. Project materials are also in use in school districts
in 32 states. While using project materials does not constitute an adoption,
it may constitute use of project information at some level. They have made
formal presentations to 322 districts who inquired about the project. The
director had a positive attitude toward the AIR assistance he received.

Project CERES (Career Education Responsive to Every Student) of Ceres,
California addresses students' decision-making skills and attitudes toward
work within the curriculum. Funded since October of 1979, they
have had 17 adoptions and approximately 4,500 inquiries by letter and phone.
The director said, "Honestly, we would not have made it through the JDRP
without AIR. We would never have actively solicited adoptions. We wouldn't
have even tried. The technical assistance was worth it." Here are 17 more
adoptions that would not have occurred without AIR's work.

Project MATCH (Matching Attitudes and Talents to Career Horizons)
of Ontario, California infuses career education into the regular school
curriculum for grades K through 8, with a component for staff development
and self-evaluation. Two years after JDRP approval, they have 24 formal
adoptions, plus 20 adoption agreements at various stages of implementation.
Eighty sites have purchased the materials for their program. They receive
about 1000 inquiries per year. The director said that the AIR activities
provided them with a catalyst for submitting their project to the JDRP,
something they might not otherwise have taken the time to do. He said that
AIR had a good track record in screening projects for effectiveness. From
this project, therefore, we have 44 adoptions that in all probability would
not have occurred without AIR's assistance.

Another product of the AIR report was a monograph, published for the
Office of Career Education, on getting JDRP approval for career education
projects. This monograph is in the process of publication, so that people
have not had a chance to use it yet.

A program analyst for the Office of Dissemination and Replication,
which administers the National Diffusion Network, said that at first the
AIR study had suffered from a typical problem of technical assistance to
projects: a feeling of non-involvement on the part of the developers.
However, the developers did begin to feel involved as the study progressed. He
believes that the projects function reasonably well. A factor that has
greatly assisted the dissemination of the projects, according to this analyst,
has been the transfer of funds for dissemination from the Office of Career
Education to his own office.

In developing this case study we interviewed Darvel Allred, Project
Specialist, Ontario-Montclair School District, Ontario, California; Jack
Hamilton, Principal Investigator for the AIR study of exemplary career
education projects; Nancy Keenan of the Office of Career Education; Andrew
Lebby of the Office of Dissemination and Replication; Jeanne Leffler,
Director, Project CAP, Greenland, Arkansas; Virginia Lish, Curriculum
Specialist, Ceres School District, Ceres, California; Seymour Rubak, Executive
Secretary for the Joint Dissemination Review Panel and Office of Program
Evaluation; and Nick Topougis, Director of Career Education Programs, Akron,
Ohio.
Case Study: Use of a Survey of Campus-based Aid

A survey of the beneficiaries of four postsecondary student aid programs
was conducted by Applied Management Sciences. These programs were: Basic
Educational Opportunity Grants (BEOG); Supplemental Educational Opportunity
Grants; National Direct Student Loans; and College Work-Study. Part one of
this study examined existing records and interviewed managers at the federal
and regional levels. Part two involved the collection of primary data.

According to several people we interviewed, this study was useful
largely because it provided analyses of students benefitting from the program
by race, sex, and other background characteristics essential to policy
analysis. Other surveys and program data have produced information on
characteristics of those served, but the data have been limited in their
usefulness by the fact that they involved only one program, or only certain
institutions. The current study allows examination of funding of all kinds at
the individual student level.

The Director of Quality Assurance of the Office of Student Financial
Assistance believes that the survey is one of the better studies undertaken
by the postsecondary education evaluation group in the Office of Program
Evaluation. He notes that its usefulness depends heavily on its
comprehensiveness. He maintains that he needs this comprehensive information
for long range planning, for determining whether the funds are going to the
people that policy intends, and for answering questions about funds that go
to particular groups or particular institutions.

The data on student characteristics were shared with several private
groups interested in postsecondary education. One of these was the American
Council on Education. A policy analyst for ACE said that she produced some
statistical analyses of the survey data which were used by her group in
developing their own policy position on campus-based aid. In particular,
ACE was concerned with the half-cost provision for funding under the Basic
Educational Opportunity Grants. Under this limitation, a student's Basic
Grant may not exceed 50% of the cost of his or her attendance. The survey
gave ACE support for their contention that this provision was hitting low
income students the hardest. The position taken by ACE is important because,
together with other groups interested in postsecondary education, they produced
for the House Subcommittee on Postsecondary Education a proposal to alter
BEOG which was, according to the House Committee report, "substantially
incorporated into H.R. 5192 [the House bill]" (p. 18). This proposal represented
a compromise among these groups which over the course of several years raised
the amount of money a student could receive under BEOG. In addition, the
percentage of the cost of education that could be covered by BEOG was to
increase to a maximum of 75% in 1985. As of this writing, the proposed
changes in BEOG have not yet become law.
The former project monitor for the survey said that he had responded
to several informal requests from Congressional staff for analyses of the
data by race and sex of the participants. At the time, this project monitor
was assigned part time to the Deputy Commissioner's Office to develop
background information for the reauthorization hearings on higher education.

In developing this case study, we interviewed Ernst Becker, Director
of Quality Assurance, Office of Student Financial Assistance; Dr. Salvatore
Corallo, Director of the Division of Postsecondary Programs, Office of
Program Evaluation; Alexander Ratnofsky, former Project Monitor for the study
of Campus Based Aid; and Patricia Smith, Associate Director of Policy Analysis
Services, American Council on Education. Time did not permit obtaining
corroborative testimony from Congressional support staff.
Case Study: Use of the Vocational
Education Equity Study

The Education Amendments of 1976 mandated a study of the extent of
sex discrimination and sex stereotyping of vocational education programs
supported under the Vocational Education Act and an assessment of progress
in overcoming such discrimination. American Institutes for Research carried
out the study, completing data collection in the spring of 1978. They visited
state vocational education agencies and a large sample of high schools,
vocational schools and technological institutes, and community/junior
colleges. The study showed that students continued to be concentrated in
classes stereotyped as "appropriate" for their sex. The teachers of vocational
education were overwhelmingly concentrated in classes stereotyped as
"appropriate" for them to teach.

The study drew immediate interest from public groups of all kinds,
including groups that monitor educational equity, employers, labor unions,
state and local commissions on the status of women, vocational educators,
education associations, and even the Army White Sands Missile Range. The
staff member of the public relations office of the Department of Education
assigned to distribute information about the study says that it has generated
more interest than any other report she has seen since she has been in public
relations. After distributing its first printing of the executive summary,
the Public Relations Office is on its second printing.

So far, use of the information has by and large been confined to
that of background information. We called a sample of the groups inquiring
about the study. These included the NAACP Legal Defense Fund; ACLU of
Georgia; Project Opportunity, a group sponsored jointly by the Center for
Women and Work and the Coalition of Labor Union Women; and the National
Advisory Council on Women's Educational Equity. All these groups said that
the study had provided important background information. However, they
had not yet used the information for any specific decisions.

The study may be used more heavily for decisions in the near future.
Reauthorization hearings for the Vocational Education Act will take place in
1981. Several groups are developing position papers and materials for these
hearings that cite the Vocational Education Equity Study extensively. These
include the National Advisory Council on Women's Educational Equity and
another organization concerned with vocational education. For the moment,
there is little political action to be taken.

State sex equity coordinators may be making use of the report.
For example, the coordinator for the State of Iowa says that the report
has been "extremely useful" in giving direction to state equity efforts.
The "promising approaches" to the elimination of sex bias that are described
in the evaluation have shown Iowa that such projects are feasible,
and the state is funding projects of this kind for the first time this year.
The state coordinator also distributed the executive summary to provide
guidance to the first meeting of the state sex equity council.
Two of our contacts expressed serious reservations about the
way that the study was handled. Both the Special Advisor on Women's Issues
of the Office of Vocational and Adult Education, and the Executive Director
of the National Advisory Council on Women's Educational Equity maintain that
the conclusions of the report were misinterpreted. Many people misread the
report and believe that it evaluates the sex equity coordinators. However,
the legislation authorizing sex equity coordinators was passed in October
of 1977, only three months before AIR began to collect evaluation data.
In fact, the report cautions readers that fully half of the coordinators
were very new to the job. However, this information is easily overlooked.
Because the report concludes that as of 1978 vocational education was still
sex-stereotyped, it gives the unjust appearance that sex equity coordinators
are not being effective.

The Special Advisor to the Office of Vocational and Adult Education
mentioned, for example, that one of her colleagues heard the evaluation cited
as evidence against sex equity coordinators in a meeting of the National
Commission on Employment Policy. According to the Special Advisor, many
state coordinators are concerned over such misinterpretations of the report.
The Special Advisor noted that the state coordinators would, in fact, welcome
an evaluation of their efforts, now that their positions have been in existence
for three years.

The Special Advisor and the Executive Director of the National Advisory
Council noted other problems related to utilization. The clearance process
for the report was exceptionally long. The evaluators finished collecting
data in spring of 1978, and the report was only released in February of
1980. The fact that the study has only been available for six months as
of this writing may well have affected the degree of use to which it has been
put. Dissemination of the report has been hindered by its cost, which is
$35 for the complete set of volumes. Although the Special Advisor regards
as useful the section of the report on "promising approaches", the Executive
Director of the Council cannot recall any of them being adopted by people
with whom she is acquainted.

In creating this case study we interviewed Barbara Bitters, Special
Advisor on Women's Issues of the Office of Vocational and Adult Education;
Beverly Gillette, Non-sexist Vocational Education Consultant (equivalent to
a sex-equity coordinator) for the Department of Public Instruction of the
State of Iowa; Laurie Harrison, Principal Investigator for the AIR study
of Equity in Vocational Education; Charlotte Hoffman of the Public Relations
Office; Virginia Looney, Coordinator of the Vocational Education Monitoring
Project, ACLU of Georgia; Phyllis McClure of the NAACP Legal Defense Fund;
Shirley Robock of Project Opportunity; Dorothy Schuler, Project Monitor,
Office of Program Evaluation; and Joy Simonson, Executive Director of the
National Advisory Council on Women's Educational Equity.
Case Study: Use
of an Evaluation of Upward Bound

Research Triangle Institute conducted an evaluation of Upward Bound that
followed the progress of students through high school and entry into college.
Among the major results were the following: students were more likely to enter
college if they had participated in the program for two or three years; of
those students not immediately entering college, a much larger percentage of
the Upward Bound students entered college within 3 years; and Upward Bound
increased the number of minority and poverty students in college. This
evaluation is the second follow-up study of an evaluation originally conducted
in 1976, but is the first study that contains detailed information about the
impact of the program.

All federal agency staff that we interviewed maintained that a major use
of the evaluations occurred each time Congress appropriated money for the
program. One analyst involved in the preparation of the program budget for the
Department of Education noted that the positive evaluations helped make the
case to Congress, which has always appropriated more money for the program
than the Administration requests. Moreover, it is likely that the evaluations
of Upward Bound have had something to do with these decisions. After being
held constant for 4 years, funding for the program has increased steadily
since the first evaluation of the program was released in 1976. However,
apart from the opinions of the federal agency staffers, we have been unable
to establish any direct link between the results of the evaluation and
Congressional budget decisions.

The Education Amendments of 1980 will probably change the current
eligibility criteria for Upward Bound to 150% of the Orshansky index of poverty.
Students with no family tradition of college attendance, regardless of their
poverty status, will be allowed to participate as well. The evaluations of
Upward Bound probably influenced these decisions indirectly. All students
in the program were found in the study to need Upward Bound services, but only
three fourths of students in the study met the current poverty criterion.
Lobbying groups in postsecondary education were, according to the project
monitor, able to use this information as support for their position on changes
in Upward Bound that will probably be adopted.

Upward Bound staff are now writing regulations for the program in
anticipation of changes in the law. Criteria for funding Upward Bound projects
will probably include the prior performance of the projects. Prior performance
will be evaluated in terms of: persistence of students in higher education;
proportion of students completing high school; and proportion of students
entering postsecondary education. These criteria have been demonstrated as
measurable by the evaluation, so that the regulations writers view them as
feasible. Two of the program people we interviewed confirmed this use of the
evaluations.

The first evaluation of Upward Bound was also used in regulation writing,
according to the project monitor. This study had found that two or more years
of Upward Bound are necessary for the program to make a difference. This finding
was used in writing the current regulation that seniors in high school may not
be admitted to the program as beginning clients.
In developing this case study we interviewed Dennis Carroll, Project
Monitor for the evaluation of Upward Bound, Office of Program Evaluation;
Salvatore Corallo, Director, Division of Postsecondary Programs, Office
of Program Evaluation; James Herbert, Program Analyst for Higher Education
Programs, Office of Planning and Budget; Shelly Laverty, Program Analyst,
Office of Postsecondary Education; and Velma Montero, staff member, Upward
Bound Program. We did not have sufficient time to corroborate this testimony
with information from outside the federal agencies, e.g., Congressional support
staff, aside from CBO (see the case study on the Congressional Budget Office).
Case Study: Use of an Evaluation
of Services in Indian Education

This study examined projects at over 200 sites that were funded
under Part A of the Indian Education Act. Projects were most frequently
directed toward instruction in cultural heritage and native language, remedial
reading, improvement of self-concept, and remedial mathematics. Successful
implementation of projects was found to be facilitated by the size of the
grant, type of objectives, parent involvement and density of Indian population.

According to a planning officer for the Office of Indian Education, the
findings of the study provided evidence for several assumptions that the
Office had held for a long time. These included the importance of parental
involvement and that projects rarely show effects if funding is below a
certain level. Because school districts are funded on the basis of the number
of Indian children attending, low funding levels may interfere with the
success of some programs. They are therefore trying to persuade Congressional
Committees to increase appropriations to allow for increased per pupil
expenditures. This officer did not cite any hearings, however.

This same planning officer found that information on funding patterns was
[illegible] higher concentrations of Indian children have a greater need for
services. This is true because Indian [illegible] represent a financial burden
for districts [illegible] redressed through these programs. Judith Baker says
that [illegible] budget submission to OMB and Congress [illegible] entitled
"Justification of appropriations estimates for committees on appropriations,
FY 1981, Indian Education, Department of Education." [illegible] This occurred
about [illegible] material did help them by providing statistics, the specifics
that showed they were well prepared.

Both the planning officer and the Deputy Assistant Secretary for Indian
Education felt that the study was not particularly helpful. All the study did,
in the words of the Deputy, was "list what was out there." They would have
preferred an impact study or one which showed which types of projects were
effective for Indian students. There are currently plans for such a study in
the Office of Program Evaluation.

In developing this case study we interviewed Judith Baker, Deputy Assistant
Secretary for Indian Education; Emmet Fleming, Project Monitor for the Study
of Indian Education, Office of Program Evaluation; Patricia Matthews, Planning
Officer, Office of Indian Education; and Thomas Mullowney, Principal
Investigator for the Study of Indian Education, Communications Technology
Corporation. We did not have sufficient time to corroborate this information
using other sources, such as Congressional support staff.

FOOTNOTES
(Chapter 6) 



1. The full citations to the published reports of authors cited in
this chapter are given in the reference list. Where specification
is necessary, we use footnotes to identify particular
articles in the reference list.

2. See respectively, David (1978), Emrick et al. (1977) on the AIR
Studies, Lyons et al. (1978) for reports of UCLA's Center for
the Study of Evaluation, Alkin et al. (1974), Alkin et al. (1979),
Berman and McLaughlin et al. (1977) on the Rand Study and Datta's
(1979) critique, and Kennedy et al. (1979) for information on the
Huron Institute Study. See Holley (1980) and Webster and Stufflebeam
(1980) on Austin's research.

3. See respectively Emrick et al. (1977), Mitchell (1980), Bissel (1979),
and Hope Associates (1979).

4. See the series of Annual Evaluation Reports issued by the U.S. Office
of Education (1976-1979), U.S. Senate, Committee on Human Resources
(1978), and U.S. House of Representatives, Committee on Labor and
Education (1978, 1979). The survey results are given in Florio
et al. (1980), Fok (1977), and Weinberg (1979). Case studies are
reported by Datta (1976) and Millsap (1978).

5. Claims made in this paragraph are based on corroborated interviews.

6. See Mitchell (1980), for instance.

7. See Sproull and Zubrow (1980). 

8. Tallmadge (1977b, 1977c). 

7. RECOMMENDATIONS

Recommendations to the Congress are discussed in Section 7.1; those
intended for the Department of Education are given in 7.2, along with a
brief rationale for each. Section 7.3 contains an extended description
of the rationale for selected recommendations.

7.1 RECOMMENDATIONS TO THE CONGRESS
Planning and Executing Evaluations 

We recommend that the Congress direct the relevant staff of Congressional
committees, GAO, and CBO to meet regularly with evaluation staff of
the Department to:

. reach agreement about when particular evaluations are warranted
  and the senses in which each evaluation required by law is possible.

. clarify Congressional information needs, quality of evidence
  required, and planning cycle for each major evaluation required
  by law.

. identify specific committees and groups as audiences for
  evaluation results.

. identify the changes in program or understanding which could
  occur on the basis of alternative findings.

This recommendation hinges partly on the fact that a statutory demand
for "evaluation" is ambiguous. The word can imply any activity from journalistic
reporting to full-blown field experiments dedicated to estimating the effects
of programs on children. The involvement of multiple interest groups is
often necessary, but complicates matters. At worst, general demands "to evaluate"
obscure the fact that feasibility of evaluation varies enormously and that
elaborate evaluation may be unnecessary. Periodic efforts have been made by
members of the Congressional staff to assure that production of evaluations
coincides with authorization cycles and that Congressional needs are understood.
The process is less regular and less orderly than it ought to be.

Statutory Provisions for Evaluation 

We recommend that, in constructing statutory provisions for evaluation,
the Congress:

. specify exactly which questions ought to be addressed and the
  audiences to whom results should be addressed.

. provide for formal assessment of the evaluability of the
  relevant program where specification is not possible.

. provide for statistically valid field testing of proposed
  evaluation requirements where specification is not possible
  and in-house assessment insufficient.

Though statutes are explicit about routine reporting requirements,
references to evaluation often are not specific. The simple requirement
to evaluate whether the program meets objectives of the statute is common
and vague. Hearings are often not informative. Defining evaluation
requirements in terms of the questions which should be addressed is sensible
so long as the questions themselves make sense, answering them is feasible,
and the answers are likely to be useful. The specification of audiences,
especially particular committees or Congressional support agencies, should
enhance usefulness. We recognize that explicitness is often not feasible
or desirable. Consequently, we suggest formal investigation of evaluability
to clarify questions, audiences, and the ways in which results can be used,
within a year after enactment of a demand for evaluation.

Evaluator Capabilities 

We recommend that:

. capabilities be assessed before new statutory evaluation
  requirements are directed at LEAs and SEAs, to determine
  where resources are adequate to meet the demand.

. training or technical assistance be expanded when the
  demands are notable and capabilities low.

. the feasibility and desirability of direct contracts
  programs to capitalize on LEA and SEA capabilities be explored.

The first recommendation stems from the conclusion that no real standard
for assigning the title "evaluator" exists. Skills required of the evaluator
depend heavily on the nature of the evaluation demand and on LEA and SEA
interest in evaluation. The second recommendation is based on the finding that
most LEAs and SEAs need assistance when the demand is high, and want it. A small
minority of LEAs have strong evaluation units. But these are a major resource
and we believe that direct grant opportunities should be expanded to capitalize
on them.

Use of and Authority for Better Evaluation Designs 
We recommend that the Congress:

. routinely consider pilot testing every new program, variations
  on existing programs, and program components before they are
  adopted at the national level, using high quality evaluation
  designs.

. authorize the Secretary explicitly in each evaluation statute
  to use high quality designs, especially randomized field
  experiments, for planning and evaluating new program components,
  program variations, and new programs.

The rationale for the first recommendation is that higher quality
evaluations are more feasible before the program is adopted at the national
level. Better designs can be employed and conclusions then are likely to
be less ambiguous; political-institutional constraints are likely to be
less severe; and programs can be introduced in stages, so that earlier
stages are a pilot for later ones. We stress formal tests of new program
components and new variations here because such evaluations are not a matter
of common practice. We will not learn how to bring about clear, detectable
changes without more conscientious tests.

The second recommendation stems from our conclusion, based on this and
other research, that better designs must be used if the Congress or the
Department wants good estimates of the effects of programs on children. We
do not advocate attempting to estimate those effects in all cases. The process
is complicated under the best of conditions, despite cavalier announcements
that the [illegible] that it was successful [illegible]. We advocate explicit
authority in statutes for high quality designs, especially randomized
experiments, to facilitate their use. We believe such explicit statutory
provision is essential because such designs are [illegible].

Critique and Reanalysis of Evaluation Results

We recommend that, in statutory requirements for evaluation of major
programs, the Congress:

. also require independent, balanced, and competent critique
  of evaluation results that are material to policy decisions.

. require critique of samples of evaluations submitted by
  LEAs and SEAs in response to legal requirements.

. require that statistical data produced by national evaluations
  be made available for reanalysis.

By critique we do not mean adverse commentary. We mean reasoned judgments
about whether the results of an evaluation are sensible and can inform
decisions. The main reason for the recommendation is that such criticism is
not routine, but it is essential to enhance credibility of good evaluations,
to properly identify poor evaluations as such, and to provide feedback to
evaluation staff, contractors, and grantees about the quality of their
work. There is no formal system for competent critique of evaluation reports,
although such reports are likely to benefit from criticism.






Use of Evaluation Results

We recommend that the Congress:

. direct staff of relevant committees, the Department, and
  the GAO to routinely outline which institutions can reasonably
  be expected to use results of each major evaluation and how
  such results might be used, during the design stage of every
  major program evaluation.

. specify exactly which evaluations have been used and why they
  were used, which have not been used and why they were not
  used, in authorizations and appropriations committee reports.

. require specific information about changes resulting from
  evaluation, whenever the law requires SEAs to describe uses
  of evaluation.

. explore the feasibility of direct competitive grants and
  contracts programs focused on improving the use of results
  at the LEA and SEA level.

The first recommendation's origins lie in the absence of any mechanism
for planning use at the national level. Simply put, unless specific user
groups are identified and some decision options laid out, evaluation results
are less likely to be used. Indeed, if there is no clear way to link the
evaluation with decisions or considerably better understanding, one can
argue that the evaluation shouldn't be done at all. Specifying expectations
will also make it easier to track utilization, and that in turn will
help to inform judgments about how evaluation resources could be better
allocated. The recommendations to cite useful and useless evaluations in
federal reports and to require SEAs and LEAs to record specific changes have
the same objective: understanding use better in the interest of better resource
allocation. The suggestion to identify useless evaluation is not an invitation
to criticize arbitrarily. We found that some LEAs and SEAs are capable and
interested in inventing and testing better ways to use information. The
suggestion to expand their opportunities for doing so is based on this.

Standards and Guidelines

Recently developed standards and guidelines for evaluation are not
appropriate for incorporation into law. They are sufficiently well developed
to recommend that the Congress:

. use such guidelines to understand what can reasonably be
  expected of evaluations.

. direct that agencies use them as a guide where appropriate
  to developing criteria for judging evaluation plans submitted
  by LEAs and SEAs.

. elicit assistance in the interpretation of guidelines from
  Congressional support agencies, such as GAO, that have been
  instrumental in their construction.

7.2 RECOMMENDATIONS TO THE DEPARTMENT



Authority for Technical Discussion

We recommend that the Department:

. authorize technical staff of evaluation units to initiate
  discussion of evaluation plans with pertinent Congressional
  staff, at their discretion, and refrain from directives which
  impede direct discussion.

The rationale for the recommendation is simple: competent evaluators
can expect to [illegible] information when they have the opportunity to discuss
[illegible] frequently. Restrictions on the evaluation unit's initiating
discussion with Congressional staff of Committees that demand evaluation
prevent the job from being done better. We recognize that restrictions on
bureaucratic lobbying for programs are warranted and that [illegible] to keep
the process orderly. But the lack of a clear opportunity to figure out what
Congress can use decreases the likelihood that evaluations will be timely,
relevant, and credible, and [illegible] useful. Relaxing restrictions will
not, of course, guarantee usefulness.

Planning and Executing Evaluations

We recommend that the Department:

. negotiate agreement about when particular evaluations are
  warranted and the senses in which each evaluation required
  by law is possible.

. clarify Congressional information needs, quality of evidence
  required, and planning cycle for each major evaluation
  undertaken by the Department.

. identify specific audiences or groups for evaluation results.

. identify the changes in program or understanding which could
  occur on the basis of evaluation results.

The rationale for this recommendation is identical to the one offered
for the similar recommendation to Congress: understanding Congressional
information needs requires regular discussion between technical evaluation
staff and Congressional staff. Scarcity of evaluation [illegible] that
planning cannot be [illegible] without dialogue among relevant staff.

Tests of New Program Components, Program Variations, and New Programs

We recommend that the Department authorize explicitly the use of
high quality evaluation designs, especially randomized experiments, in
evaluating new program components, program variations, and new
programs, in all regulations which require estimating the effects of
innovative changes.

The main justification is that high quality designs lead to far less
debatable estimates of effects of programs on children than low quality designs.
They are more difficult to execute, and they are more feasible for pilot testing
new programs, program variations, and program components than for estimating
the effects of ongoing programs. Explicit authorization would make the
importance of good designs plain, and would provide a clearer opportunity
for competent SEAs and LEAs to exploit them.

Critique and Secondary Analysis of Evaluation Results

We recommend that the Department:

. provide for the independent, balanced, and competent
  critique of every major evaluation funded by the
  Department in procurement of evaluations and
  evaluation policy.

. incorporate into procurement procedures and policy
  the requirement that all statistical data produced
  in major program evaluations be documented and stored
  for secondary analysis.

. create an administrative mechanism for deciding when
  simultaneous analysis by both the original evaluator
  and an independent analyst is desirable and feasible,
  and a mechanism for executing simultaneous independent
  analyses.

The rationale for this recommendation is identical to the one offered
for a similar recommendation to Congress.



Access to and Specification of Reports 

We recommend that the Department adopt a policy to:

. adhere to a clearance rule which makes evaluation reports
  available after a specified period of time.

. specify completely the evaluation documents referred to
  in the Department's Annual Evaluation Report, the Federal
  Register, and policy statements.

. include, in every major evaluation report, a list
  of core recipients of the report, or compile
  publicly available lists of core recipients.

The recommendation stems partly from difficulties encountered in
obtaining reports under review by the Executive Secretariat and other
groups involved in the DHEW clearance process. We also found it difficult
to identify reports precisely when they were cited as evidence of the
usefulness of evaluation in developing regulations or policy. The absence
of a list of core recipients of reports makes it very difficult to identify
potential user groups and to determine if reports were used. The
consequence is that what is useless or useful is less verifiable.

The Use of Evaluation Results 

We recommend that the Department direct evaluation unit staff or
evaluation contractors to:

. provide oral reports regularly as well as written reports
  on results of major evaluations, and on the uses to which
  results can be put, to relevant Congressional staff and
  support agency staff, and the program staff within the
  Department.

. create a system to periodically collect, synthesize,
  and report specific uses to which evaluations are put.

. improve the Annual Evaluation Report by citing instances of
  use more specifically.

. meet regularly with Congressional staff to clarify
  information needs, feasibility of evaluation,
  audiences for results, and ways in which results can be used
  to modify programs.

The recommendations are based partly on the finding that use of
evaluation results is not tracked conscientiously and the belief that
it ought to be tracked to learn how to do evaluations better, and how
to better allocate evaluation resources. The rationale for the last
recommendation is identical to the one given earlier on planning and
executing evaluations.

Implementation

We recommend that the Department:

. routinely require formal measurement of the degree to
  which program plans match actual operations.

. adjoin research on methods of measuring implementation
  to the introduction of new programs and program variations.

. create an inexpensive central information system on the
  time and resources required for full implementation of new
  programs.

The main reason for the first recommendation is simply that measurement
of implementation of innovations is infrequent. The reason for the
second recommendation is that we know little about cheap, effective methods
of measurement in this arena. The third recommendation stems from the
absence of any reasonable empirical guidelines on the time and resources
necessary to implement innovative programs.

7.3 RATIONALE



The recommendations are based partly on the Project's findings and on
judgments about what needs to be done to improve evaluation practice. We
received comments from some Congressional and agency staff,
the National Academy of Sciences Committee on Program Evaluation,
and LEA and SEA staff. But time did not permit any systematic critique.

The description of the rationale is divided into broad topical
categories. The pertinent sections of the preceding material are noted.

Planning and Executing Evaluations

Planning and executing evaluations is complicated by the large number
of potential participants: the Congress, the Department of Education's
Office of Evaluation, the Congressional Budget Office, and the GAO.
[illegible]

Several improvements appear to be necessary. They include (a) regular
meetings among evaluation staff of the Department and the pertinent
Congressional Committees, (b) a planning system which matches evaluations
to authorization cycles, (c) [illegible] systems which make access to
previous work simpler and faster, and (d) identification of groups which
can contribute to technical quality of the effort.

[illegible] there has been a renewed effort to [illegible] production of
evaluations to the reauthorization cycle. We understand from memos and
recent activity of the Office of Assistant [illegible] be sustained.
[illegible]

Meetings. There is no system of regular meetings among technical staff
[illegible] whom answers ought to be addressed. We recognize that [illegible]

Apart from these fundamental issues, such meetings might address
chronic problems. Short term information as well as long term information
is often of interest to some audiences for results and because each type
[illegible] to be made emphatically. Because every major evaluation must be
tailored, the level of flexibility, what is known and what is not known, ought
to be made reasonably clear.

We do not mean to imply that a lockstep series of discussions among all
relevant staff is warranted or possible. The point is that the absence
of regular meetings on Congressional needs virtually guarantees that some
needs will not be met. That in turn invites buck-passing and evaluations
of lower utility.

Relevant Groups. The groups which should be involved in the process
include evaluation staff from the Department of Education's Office of the
Assistant Secretary for Management and from the pertinent Congressional
Committees. It is sensible to capitalize routinely on support agencies,
such as the GAO's new Institute for Program Evaluation and relevant divisions
of the Congressional Budget Office and the Congressional Research Service.
We do not mean to imply that all agencies need be represented always in
lockstep meetings.

Interest Groups and their Role. Interest groups that draft bills which
create or modify programs should be urged to provide plans for evaluation of
the effects of the proposals. These plans should be routinely reviewed by
the Department's evaluation unit if not by a group which includes Congressional
staffers and unit staff.

Impediments. There are impediments to any meetings of this sort, of
course. On the agency side, for instance, staff have maintained that they
have not been free to initiate conversations which would clarify intent of
a demand to evaluate, on account of executive policy that restricts
discussion. The restrictions are said to have a variety of legitimate origins,
including preventing agency staff from lobbying directly and independently
for pet programs, and assuring that there is at least some orderliness
in dealing with the Congress.

For evaluation by units with authority to evaluate, such restrictions
are misdirected and inappropriate. No one can conscientiously address a
question posed by Congress if the question cannot be discussed directly.
We believe that agency policy must recognize the relative independence and
discretion of evaluation units.

Impediments on the Congressional staff side appear to be less
administrative than physical. To be sure, there are staffers who will
participate in no discussion unless directed by a Committee chairman to do so.
But they appear to be in the minority. The more general problem is, we
are told, time: the sheer difficulty of coordinating meetings so as to be
reasonably convenient to both agency staff and Congressional staff. We have
not had the time to examine the validity of this complaint. But we find it
difficult to believe that it is insurmountable.

[illegible] episodic reluctance among Congressional staffers
to talk to contractors. We do not know how serious the problem is. It
does seem sensible to exploit agency staffers as much as possible when
[illegible] contractors. It seems equally sensible to assure that ingenuous
mistrust of contractors does not impede understanding of how the evaluation
ought to be conducted.

[illegible] sufficiently formidable to justify some joint action by the
Congress and the Department. That action should involve [illegible] methods
of ameliorating the problem and perhaps narrowly focused tests of their
feasibility.



Statutory Provisions for Evaluation



Though statutes are often specific about program reporting, references
to "evaluation" are often not. The simple requirement to evaluate,
or to evaluate the effectiveness of the program in meeting the objectives
of the statute, is frequent.

There is, however, enormous variety in the way individuals at the
[illegible] levels of government interpret the word evaluation. [illegible]
concerns the array of questions which might be
addressed in an evaluation, the approaches one might choose to answer
them, and the level of detail at which they might be answered.

A more specific statement of the questions which need to be addressed
[illegible] confusion and ambiguity in what is intended by law
[illegible] of scope and probable costs and benefits of the
information. See Chapters 2 and 3 for details.

Specification. If the Congress needs to know:

. how many are served and how many are in need,

. what are services and their costs,

. what are the effects of programs on their primary or
  secondary clients,

. what are the costs and benefits of alternatives,

the Congress should request that information explicitly. That it is feasible
to be more specific is clear from the statutes mandating the NIE Compensatory
Education Study. That specification is not always sufficient is clear from
the same study: intensive discussion was needed to clarify evaluation goals.

The same discipline ought to be asked of interest groups, advisors,
[illegible] evaluation language for programs. It is sensible
to ask that the questions be specified along with other features of the
program.

When Specification Is Not Possible. It will not be possible or desirable
to be explicit in every case. To assure that general demands for evaluation
are not misinterpreted, the law should provide for a formal assessment of
the senses in which the program can be evaluated within one year after the
enactment of the legislation.

Regardless of Specification. Regardless of how specific requirements
in law are, there is a persistent need for regular dialogue between agency
staff and Congressional staff in refining questions and developing agreements
on what level of quality of evidence is warranted, and at what cost. The
dialogue has occasionally been encouraged in Congressional Committee Reports, by
some Congressional staff and by some agency staff. But it is irregular
and more heavily dependent on individual preferences than it should be.
It is also a demanding process.

Audiences. Because evaluation results may be directed to any number
of audiences (the Congress, Department management, interest groups, advisory
committees, and so on), there is a clear need for focus. The more audiences
there are, the more complex evaluations become.

Pilot Tests of Evaluation Demands. Where there is substantial
disagreement about which questions should be addressed and about how the
information might be used, pilot evaluations should be undertaken. That is,
one mounts formal small scale experiments to determine which of several
different evaluation schemes work best. They can be put into the field (a)
to determine paperwork burden on respondents, (b) to determine costs of
collecting the information, (c) to determine the quality and usefulness of
the information, and (d) to clarify language which can be used in statute
and regulation.

Use of and Authority for Better Evaluation Designs

The authority to use better designs, especially randomized experiments,
in the interest of relatively unequivocal evaluations of new programs, new
program variations, and new program components must be made explicit in law
and regulation. For details, see Chapter 5.

By "randomized experiment" here, we mean assigning children, schools,
or classrooms randomly to each variation, for instance, and then observing their
performance under each regimen. The random assignment is a key feature. It
guarantees that, in the long run, comparison of the variations will be
fair. This is one of the reasons the design has been used in the Negative
Income Tax Experiments, in the Manhattan Bail Bond experiments, in evaluation
of T.V. programs such as Sesame Street, the Electric Company, and Free Style,
as well as in the evaluation of the effectiveness of medical treatments.
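
The assignment step described above can be sketched in a few lines of
Python. This is only an illustration under stated assumptions, not the
procedure used in any of the studies cited: the function names, the school
labels, the two variations, and the seed are all hypothetical. Units are
shuffled and then dealt out to the variations in turn, so that chance alone,
rather than anyone's judgment, determines which units receive which variation.

```python
import random
from statistics import mean


def assign_randomly(units, variations, seed=None):
    """Randomly assign each unit (child, classroom, or school) to one variation.

    The units are shuffled and then dealt out in turn, so group sizes
    differ by at most one.
    """
    rng = random.Random(seed)  # a recorded seed lets the assignment be reproduced
    shuffled = list(units)
    rng.shuffle(shuffled)
    groups = {variation: [] for variation in variations}
    for position, unit in enumerate(shuffled):
        groups[variations[position % len(variations)]].append(unit)
    return groups


def mean_outcome_by_variation(groups, outcomes):
    """Average the observed outcome of the units assigned to each variation."""
    return {
        variation: mean(outcomes[unit] for unit in members)
        for variation, members in groups.items()
    }


# Hypothetical example: twelve schools, two variations.
schools = [f"school_{i:02d}" for i in range(1, 13)]
groups = assign_randomly(schools, ["new_component", "current_practice"], seed=1980)
```

Once performance has been observed, a call such as
mean_outcome_by_variation(groups, scores), with scores a hypothetical mapping
from school to measured outcome, gives the averages to be compared. Recording
the seed allows the assignment to be reproduced and audited later.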

The rationale for the first part of the recommendation, for pilot tests
of new programs, is that higher quality evaluations are much more feasible
before the program is adopted at the national level. Better evaluation designs
can be employed, conclusions are less likely to be ambiguous, and
political-institutional constraints are less likely to be severe. The
introduction of new programs can be staged so that earlier stages constitute
pilot tests for the later ones. This may seem terribly mundane to some readers.
But recognize that in current political discussion of the proposed Youth
Incentives Program, for instance, an enterprise whose probable costs will
exceed $850 million per year, there has been no formal attention to pilot
testing or staged introduction of the program. Title I compensatory
education programs evolved in the same way ten years ago, and we still
know pathetically little about effective variations. The simple notion
that massive new programs ought to be pilot tested is warranted.

The second part of the recommendation, concerning higher quality
evaluation designs, is based on the presumption that we won't learn how
to bring about clear detectable changes in the performance of children or
schools without more conscientiously designed tests. The justification
for the recommendation lies partly in the poor quality of designs used in
[illegible]. It is discouragingly easy to find, for example, Congressional
testimony in which a program is declared to be a success by a state
[illegible]. We do not advocate attempting to
estimate program effects in all cases. The process of estimating effects
is difficult under the best of conditions. We advocate attention to high
quality designs, especially randomized experiments.

There are LEA and SEA evaluators with the interest and the
ability to employ the design for the sake of fair tests. An obstacle,
we believe, is confusion about authority for running such tests. So, for
instance, an evaluator offered the opinion that the design is desirable
but that he, in the absence of a clear mandate, could not risk employing
it. At the federal level, we believe the authority exists. Indeed,
[illegible] conducted for the Emergency School Aid Act [illegible]
employed state of the art experimental designs.
The failure of federal program managers to encourage randomized experiments
at the local level is partly because the mandate to do so is not explicit.

The usefulness of randomized tests in principle is generally not at issue
in discussions about evaluation of new education programs: there is
agreement that when experiments are conducted properly, orthodox theory
guarantees that long run estimates of program effects will be unbiased.
Argument about the uses of the design concerns the idea that randomized
experiments are rarely feasible in field settings. Rareness and
feasibility are, however, infrequently specified by government policy
groups or by individual analysts. Rareness does not establish lack of
feasibility, and a notable if not large number of field tests have been
mounted. Some recent illustrations were covered in Chapter 5. Judging
from precedent, bald claims that it's impossible to assign individuals or
schools or other units randomly to programs for the sake of fair estimates
of program effects are unwarranted. Precedent is imperfect evidence,
however, in that it doesn't guarantee that an experiment can be mounted
successfully in the situation at hand.
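
The point about long run unbiasedness can be illustrated with a small
simulation. The sketch below is not drawn from the report: the population
of 40 units, the baseline scores, and the constant program effect of 3
points are all invented for illustration. It assigns half of a fixed
population to the program at random, many times over, and averages the
resulting treated-minus-control differences.

```python
import random
from statistics import mean

TRUE_EFFECT = 3.0  # assumed constant program effect, for illustration only


def estimate_once(baselines, rng):
    """Randomly assign half the units to the program and return the
    treated-minus-control difference in mean outcomes."""
    units = list(range(len(baselines)))
    rng.shuffle(units)
    half = len(units) // 2
    treated, control = units[:half], units[half:]
    treated_mean = mean(baselines[u] + TRUE_EFFECT for u in treated)
    control_mean = mean(baselines[u] for u in control)
    return treated_mean - control_mean


rng = random.Random(42)
# A fixed, invented population of 40 units with varying baseline scores.
baselines = [50 + rng.gauss(0, 10) for _ in range(40)]

# Repeat the randomized experiment many times on the same population.
estimates = [estimate_once(baselines, rng) for _ in range(5000)]
long_run_average = mean(estimates)

print("range of single estimates:", round(min(estimates), 1), "to",
      round(max(estimates), 1))
print("average over all replications:", round(long_run_average, 2))
```

Any single estimate may land well away from the assumed effect, but the
average over many random assignments settles close to it. That is the
sense in which the estimates are unbiased in the long run; the sketch says
nothing about whether such a test is feasible in a given setting.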

Pilot Testing the Experimental Design. We believe that pilot tests of
experiments can yield more direct evidence on the feasibility of randomized
experiments or other high quality designs. We recommend mounting a small
assessment prior to the major field experiments to identify anticipated
problems in the field and to resolve them. The main justification for
considering such pilot tests is to work out problems beforehand. Randomized
experiments fail to be successfully implemented in education, as in medicine,
economics, etc., because the randomization is incomplete, because the programs
are not implemented as advertised, and for other reasons. Pilot tests of
the experiment itself can help to avoid unnecessary flaws in implementation.

General Criteria. Lacking dependable precedent and the opportunity for
adequate pilot tests of the evaluation design, two general criteria for
judging feasibility of randomized experiments seem sensible.

The first criterion turns around the fundamental notion of equity.
Where there is an oversupply of eligible recipients for a scarce resource
(program services), then randomized assignment of children to the resource
seems fair. So, for instance, Vancouver's crisis intervention program for
youthful status offenders affords equal opportunity to eligible recipients.
Since all could not be accommodated and they are all equally eligible, they
are randomly assigned. Experts such as Cook and Campbell argue that
randomized experiments are most likely to be carried out successfully when
the boon, real or imagined, is in short supply, and the demand for the boon
is high. This rationale dovetails neatly with normal managerial constraints:
programs cannot be emplaced all at once, and all eligible candidates cannot
be served at once. Experiments can then be designed to capitalize on staged
introduction of programs or services.
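
The equity argument and staged introduction can be combined in a simple
lottery. The following is a minimal sketch under assumed conditions, not a
procedure described in the report: the function name `staged_lottery`, the
applicant labels, and the capacity of four places per stage are all
hypothetical. Eligible applicants are put in random order, and each stage
of the program is filled in that order, so that later stages serve as the
comparison group for earlier ones.

```python
import random


def staged_lottery(eligible, capacity_per_stage, seed=0):
    """Put equally eligible applicants in random order, then fill each
    stage of the program in that order."""
    rng = random.Random(seed)  # fixed seed so the draw can be audited
    order = list(eligible)
    rng.shuffle(order)
    return [order[start:start + capacity_per_stage]
            for start in range(0, len(order), capacity_per_stage)]


# Ten hypothetical applicants competing for four places per stage.
applicants = [f"child_{i:02d}" for i in range(10)]
stages = staged_lottery(applicants, capacity_per_stage=4)

for number, group in enumerate(stages, start=1):
    print(f"stage {number}: {group}")
```

Every applicant is eventually served, and chance alone determines who is
served first. A fixed seed is used here only so that the draw can be
reproduced and checked afterward.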

A second criterion concerns settings in which it is politically
unacceptable to assign individuals randomly to control conditions despite
the fact that we know absolutely nothing about whether a program works
relative to no program at all. The ethical, moral, and economic
justification for experimenting may be quite irrelevant. In such instances,
it is often possible to ameliorate difficulties by comparing program
variations against one another, rather than comparing a novel program to
an existing one or to no program at all. A "no program" control condition
may be an unacceptable political option whether the program fails or not.
The most we can reasonably expect then is to choose the invented variation
or component which works best for the investment.

The idea of testing variations or components rather than testing a
program against a control is a compromise. But we believe that [...]
getting no information at all on the main one: what are the effects of
the program? And the idea is generalizable. In particular, for ongoing
programs that have strong public support, it seems sensible to think in
terms of randomized assignment to new program variations or randomized
[...] of new program components to discover more effective or cheaper
versions of the program. This strategy has not been employed by any major
ongoing federal education program that we know of. Indeed, it's not common
in any social area except the U.S. Census Bureau. In the latter,
randomized field tests are periodically run to understand better methods
of doing census and surveys.

The most direct action which Congress can take to ameliorate the problem
involves any statute which asks that the effects of a new program, new
program variation, or new components on children be estimated. We recommend
that such statutes include an explicit provision authorizing statistically
valid randomized experiments. [...] some explicit authority we believe is
necessary [...] compliance with [...] Secretary should be empowered to:
waive [...] projects which are likely to [...] randomized tests of [...]
on Title I programs, student loan programs, and the like.

Independent Critique and Secondary Analysis

By "critique" here, we do not mean adverse commentary. We do mean
balanced examination of the quality of the report and judgments about
[...] evidence. "Secondary analysis" here refers to analysis of raw
statistical data, undertaken to improve on the quality of earlier
analyses. For details, see Chapter 5.

The origins of this recommendation lie partly in a principle that we
should recognize good quality evidence as such, and properly identify
poor evidence [...] unnecessarily on one's confidence in the [...].
[...] evaluations are expensive. It seems sensible to allow the community
of policymakers or their advisors to make the data work repeatedly, at
low cost, in secondary analysis. Because evaluations may affect a variety
of interest groups, those groups should be given an opportunity to offer
competent criticism. Finally, we believe [...] criticism can degrade the
importance of good evaluations.



[...] for Critique and Secondary Analysis: National Level. The elements
of an effective system for critique and secondary analysis include: (a)
explicit institutional policy on rapid disclosure of reports and access
to statistical data underlying the reports, (b) a formal mechanism for
independent critique or secondary analysis where possible during an
evaluation, (c) a formal administrative mechanism for independent
critique and secondary analysis when evaluation results are [...], and
[...] guidelines on reporting and storage of statistical data.

At the national level, elements of policy on reanalysis have already
received attention, notably by NIE in supporting research and development
on the topic. The GAO has, in its guidelines on impact evaluation, taken
the position that access to evaluative data for reanalysis is generally
an important consideration. The Department of Health, Education, and
Welfare has not had a formal policy on disclosure of statistical data.
However, OED has had an unwritten policy and has released data periodically
for independent review and secondary analysis. Critiques of particular data
sets have been undertaken by the Congressional Budget Office as a part of
its efforts to screen studies for quality. These activities are undertaken
so as to recognize individual privacy needs. Making policy formal, creating
the administrative mechanisms, and testing them are sensible next steps.
For details, see Chapter 5.

The problem of rapid access to evaluation reports has been severe.
Clearance of OE evaluation reports by the Secretary, according to
federal staff members who were interviewed, has been slow at best. We
understand that the new Department of Education has adopted the 10 day
clearance rule, which should improve matters. For details, see Chapter 5.

Informed Criticism. Opinions about the desirability of early independent
review of major evaluations and of secondary analysis are not uniform. At
least some agency staff reckon that a routine process will generate more
heat than light. Assuring competent criticism in this arena is likely to
be as difficult as it is in medicine, economics, and other fields. Outcome
evaluations are always subject to criticism, especially if the program does
not work. Some of that criticism is bound to be specious, dull witted,
and self-interested. High quality in design and execution of evaluations
offers some protection against unwarranted criticism, but it is unlikely
to be sufficient. Adherence to a common set of standards for technical
quality also facilitates balanced criticism. But again, this is unlikely
to be sufficient. We have not had the time in this investigation to examine
prophylactic mechanisms. We believe, however, that openness to criticisms
must be given priority, and that some administrative research on reducing
mindless criticism should be undertaken.

State and Local Level. A good many local and state "evaluations"
provide no more than counts of those served, changes in test scores, and
similar information. It is not clear that regular, systematic reanalyses
of the raw data underlying these are warranted; it is more clear that
samples of reports ought to be critiqued periodically. For details of the
content of these reports, see Chapter 3. Discussion of their quality
appears in Chapter 5.



Programs supporting tests of innovative programs fall [...].
[...] independent, competent criticism is to assure that [...] also [...]
the exercise, in the long run.

[...] available to conduct reviews [...]. States [...] developed
evaluation units are a natural option. [...] assuring that information
about good programs [...] to LEAs. Such units may not, however, [...];
some field investigation is warranted [...] evaluation capabilities are
sufficient to generate [...] critique. For details on evaluation
capabilities, see Chapter 4.

[...] of samples of evaluations [...] examining evidence volunteered by
LEAs and other agencies [...] believe it is strong enough to sustain
frank criticism [...] expanded. The number of reviewers [...] to review
even a small additional sample, and so its complement would have to be
enlarged.

[...] constitute another [...] advice only when asked about Title I
programs. It is [...] clear that TACs can be regarded as independent
reviewers [...] evaluations in the first instance [...] about their
capabilities to warrant expanding their mission without further
investigation.

[...] reanalysis of evaluation reports [...]:

. plans and administrative vehicles for critique and reanalysis,

. alternative sample designs and time frames,

. design of pilot tests for review so as to estimate costs and
benefits of a system before it is emplaced, to determine if the
effort is indeed justified.
Access to Reports

Effective mechanisms to assure early release of evaluation reports
and ready access to reports ought to be created.

The origins of this recommendation lie partly in the idea that evaluation
reports offered as a basis for policy, major executive decisions, and
oversight should be open to competent criticism and should be accessible
to a wide variety of potential users. It stems partly from the difficulty
encountered in obtaining reports at the federal level, though this
difficulty is far less severe than problems at other levels of government.

Rapid access to reports has over the past few years been impeded by
clearance processes within the Education Division. That is, reports issued
by a contractor have been reviewed by the Executive Secretariat before
release, and those reviews have resulted in delays in release without
notable improvement in the documents themselves.

The inclusion of a clause in contracts, requiring that permission be
sought prior to even discussing an evaluation, is more invidious. It
prevents some universities from bidding on evaluations, since the clause
runs counter to university standards of intellective independence. It is
possible that the proviso reduces the quality of reports by impeding
discussion of projects in professional forums. The mechanical difficulties
of identifying and obtaining a report or a cluster of reports bearing on
a specific evaluation are very tedious. For details see Chapter 5.

Clearance of Reports. The problem of assuring rapid access has been
rectified at least in the sense that the Office of the Deputy Assistant
Secretary for Evaluation and Management has established a new clearance
process. Reports are to be released automatically after 10 days if the
Secretary-level review has no modifications. The memorandum also permits
adjoining criticism to the released document by program managers.

We believe that automatic clearance after a specified period is
desirable. We recommend that the practice be maintained regardless of the
controversy surrounding a particular report.

The practice of requiring contractors to seek permission for discussing
results in public forums has not been examined or resolved. Our
recommendation is that no such requirement be imposed in contracts.

Distribution of Information. We suggest the creation of a Department-wide
periodical which identifies and abstracts each evaluation report
submitted to the Department and submitted by the Secretary to the Congress.
We expect this to ameliorate access problems inside and outside the
government. At its best, such a periodical will keep the public, the
Congress, and staff of the Department abreast of what has been produced
and perhaps even why it was produced. Models for this include GAO's
Monthly Reports, which summarizes reports issued by the agency.
Responsibility for Distribution. The practice of assigning sole
responsibility to the project officer for final reports is not entirely
effective [...] attention to circulating reports and submitting them
[...] ERIC. More important, they shift agencies [...] disappear. [...]
reports [...] officer. Options include:

(1) Strengthening internal agency capability for storage of
reports.

(2) Assuring that the list of core recipients for reports
is included in the reports themselves, or that such a list
is publicly available.

(3) Requiring the contractor and the agency to maintain
a list of reports, with full citations, generated
together with the location of the agency which
disseminates it.

(4) Requiring that the recipient of each evaluation
executed under contract or grant provide abstracts of
reports, reports, or both after 10 day clearance, to
ERIC, NTIS, the pertinent education centers and
laboratories, CEIS, FEDAC, Congressional staff and
support agencies, especially CRS and the GAO.

(5) Distribution of each report routinely to every federal
evaluation project officer and every evaluation contractor.

Tracking the Use of Evaluations

Our attention to this topic stems partly from the arguments we
encountered about whether evaluations are used. The question is not
whether [...]. The more interesting question is determining [...] used,
and how to [...]. The [...] question cannot be answered adequately now
because the hard answers to the "how" and "how often" questions are
fragmentary, and the soft answers are rather too dependent on flawed
memory and competing interests. See Chapter 6 for details.

[...] problem of verifying use or nonuse [...] one. Turnover of staff
responsible for initiating [...] and using evaluation is sufficiently
high that corroborating use of an evaluation through independent sources
is difficult and sometimes impossible. Titles of reports often imply
nothing about potential or actual use. The reports are misremembered or
forgotten. Incomplete citation is a chronic problem.

The following recommendations are mundane but critical for inexpensive
tracking. At best, they will eliminate part of the burden placed on
respondents in studies of the use of evaluations.
Better Specification in Reports, Regulations, etc. Failure to specify
reports in Congressional reports, agency annual reports, regulations,
and the like is not prudent. If Congressional reports, agency annual
reports, and the like are to be as useful as possible to the community of
thoughtful readers, then references to evaluation should include (a)
author, (b) title of the report, (c) date of issuance, and (d) sponsoring
agency.
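
The four elements above lend themselves to a mechanical completeness
check. The sketch below is only one way such a check might look: the
class `EvaluationCitation`, the helper `missing_elements`, and the sample
entries are hypothetical placeholders, not records from any agency.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class EvaluationCitation:
    """The four elements a reference to an evaluation should carry."""
    author: str
    title: str
    date_issued: str
    sponsoring_agency: str


def missing_elements(citation):
    """Return the names of any elements left blank in a citation."""
    return [f.name for f in fields(citation)
            if not getattr(citation, f.name).strip()]


# Placeholder entries, invented for illustration.
complete = EvaluationCitation(
    author="A. Author",
    title="Example Evaluation Report",
    date_issued="1980",
    sponsoring_agency="Example Agency",
)
partial = EvaluationCitation(
    author="A. Author",
    title="Example Evaluation Report",
    date_issued="",
    sponsoring_agency="",
)

print(missing_elements(complete))
print(missing_elements(partial))
```

A check of this kind flags only blank elements. It cannot tell whether
the cited report exists or was actually used.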



If this is not possible, then merely hiring an inexpensive, bright
graduate student to build a specific reference list for each report of
the half dozen or so Committees most pertinent to educational evaluation
would suffice, so long as access to the list and dissemination of the
list was assured.

The recommendation applies to both Congressional Committee Reports
and to major agency documents such as OE's Annual Evaluation Report and
policy statements. CBO documents are somewhat more conscientious, and GAO
documents normally carry at least part of the information suggested. Because
we do not have access to CRS documents, we cannot make a judgment. It applies
also to proposed and final regulations issued in the Federal Register, since
evaluations do result in regulation changes but are rarely recognized
completely in the prose describing changes. An illustration of exemplary
practice for this last is the recent modification of regulations on
day care.

The practice of recognizing evaluations explicitly when they have
been useful in deliberations of Congress and at the executive level is
admirable. It ought to be continued for three reasons:

. it identifies what is useful, so guiding the agencies
in the long run if not the short,

. it rewards those who perform well,

. it exhibits some integrity to an occasionally cynical audience.

The practice of recognizing good evaluations which are used is not uniform,
however. The sponsoring agency is not given credit and so forth on account
of time and resource constraints. That problem is serious enough to
discourage some staff even if it is not sufficient to demoralize them.

More conscientious attention to recognizing useful evaluations and
more conscientious attention to recognizing useless evaluations in committee
reports and the like would help.

Improvements of OE's Annual Evaluation Report. The Annual Evaluation
Report enumerates uses of evaluations completed by OE, is important, and
ought to be the best possible. Simple options for improvement include:

. The report on use should provide specific citation
of each evaluation report, its author, title, date
of issue and issuing agency. Otherwise, it's impossible
for the reader to verify that a report has been
issued, much less that it has been used.

. The report on use should provide specific citation of
hearings or Congressional reports in which an evaluation
is mentioned or used and specific citation
of regulations which are said to have been changed
on the basis of evaluation results. It should cite
regulations which are proposed or created as a result
of the evaluation. Otherwise, verifying claims of use
is difficult or impossible.

. Contributors to the section on use of evaluations
should be acknowledged to permit verification and
corroboration.

. The Annual Evaluation Report's perspective on use
ought to be reexamined to identify flaws in indicators
of use, such as citations of hearings, and the possible
biases in them. Ignoring agencies apart from Congress
makes it likely that use of evaluation results is
understated. Very little information on management
uses, apart from regulations, is provided.

. Evaluations for which it is difficult to find verifiable
evidence on their use should be identified.
Evaluations which are virtually useless two years
after production should be identified explicitly.

. [...] addressed in future examination of [...]:
Would reporting other than annually make sense?



Identifying the Recipients of Reports. Major evaluation reports should
have appended to them a list of the individuals to whom the report was sent
and their affiliations. This will facilitate tracking use simply by making
potential users or audiences clear, and it will facilitate our understanding
of misdirected effort. The practice of appending reader lists to reports is
current at the Office of Naval Research. The practice does appear to be
feasible for at least major evaluation reports.

Where enumeration of members of the audience is not feasible, then
the lists commonly generated internally and used as a basis for sending
reports to individuals ought to be accessible. The existence of such
lists, their title, and source should be identified in major evaluation
reports.
Tracking Management Changes. Very little systematic evidence is publicly
available on the nature and frequency of managerial uses of evaluation.
Moreover, there is no general mechanism for regularly following up
on whether problems identified in an evaluation have been rectified.
Follow-up does occur episodically, through questions addressed to managers
at Committee Hearings for instance. But we have been unable to identify any
special, orderly record-keeping on the matter.

We recommend that a simple examination of alternative mechanisms be
undertaken to determine if a cheap follow-up system can be developed, and to
determine how such mechanisms can be field tested.

Local and State. We have not investigated state uses of evaluations
sufficiently to make recommendations on tracking mechanisms at that level.
However, two features of some local and state efforts are worth considering
by both federal and state agencies. Some states such as Massachusetts and
Michigan require that in local reports to the state the uses to which
evaluations are put be reported regularly. Those reports are, in principle,
a vehicle for tracking use, and occasionally synthesis. We do not know
enough about the quality of reporting in this arena. But we believe it ought
to be examined. And where some states are found to have developed especially
efficient ways to accomplish this task, the procedures ought to be made
available to other state and federal agencies. The alternative to regular
reporting is a special survey undertaken to obtain periodically a better
picture of uses than one could obtain in reports. At least one state,
California, has tried this option, and the results are informative.

Standards and Guidelines. Current guidelines can be exploited in
designing evaluations and in making crude judgments about quality of an
evaluation report. But they are not equally relevant to all types of
evaluation, and they are not appropriate for inclusion in law or regulation.
They should be recognized in policy statements, internal guidelines, and
other flexible directives. See Chapter 5 for details.

Guidelines have been developed to guide design and to facilitate
judgments about evaluations. Most focus on planned efforts to assay program
effectiveness, not on routine reporting labelled "evaluation."

The guidelines are very general, as any set of guidelines on
completeness and quality of evidence must be, given the variety of forms
which evaluation may take. It is sensible, for instance, to expect that an
evaluation which purports to estimate a program's effects on children cover
pertinent topics: evaluation design, source and quality of information,
competing explanations, and so on. These elements are part of most good
guidelines. But they are, of course, no substitute for training and
judgment. Moreover, the sensible interpretation of guidelines requires some
expertise.
Guidelines have been developed by the U.S. General Accounting Office,
the Evaluation Research Society, and the independent Joint Committee on
Standards for Educational Program Evaluation. Standards are embodied in
manuals used by the USOE-NIE Joint Dissemination Review Panel in assessing
educational worth of new programs and the evidence sustaining judgments
about worth. There is substantial overlap in topical coverage of
all these. Moreover, the topical coverage overlaps with standards used
in choosing designs for major national evaluations and grants for
evaluative work supported by NIE.

The main justification for recommending that guidelines be recognized
is that we believe they can be useful in clarifying what is meant by
quality and in informing the public about what can generally be expected of
evaluation. Guidelines may also be of some assistance in protecting the
competent evaluation from gratuitous criticism, and in identifying the
worst cases of inept evaluation. Finally, they can be useful in reviewing
proposals made by LEAs for programs which require special evaluation,
such as bilingual education.



National Level. We recommend that guidelines be formally recognized as
such by agency executives and by Congressional Committee staff. They have
already been recognized by evaluation staffers within the education agencies
and GAO; indeed, agency and GAO staffers contributed to their development.
By recognition here, we mean formal acknowledgment of the existence of
guidelines, some effort to assure that pertinent staff know about them, and
some effort to test the guidelines in the field. It would not be difficult
to incorporate short reviews of guidelines in training programs and seminars
on evaluation, run by the CRS, the GAO, or the Federal Executive Institute.



State and Local Level. It is reasonable to assure that SEAs and LEAs
know about development of guidelines, to make guidelines available, and to
encourage tests of guidelines at the local level. Guidelines can, for
instance, be cited in RFPs and grant material without demanding ascription
to them. They may be made available through special purpose information
clearinghouses, such as the one for bilingual education, and the general
purpose ones, such as ERIC.

It is reasonable to encourage their use, not require it, in the
interest of fostering better quality evaluations and protecting competence.
That encouragement can be given through federal and state agency offices
which disburse funds for innovative programs.

Responsibility for advising the public, administrators, school boards,
and the like currently rests with evaluation staff at local and state levels.
It is not unreasonable to urge that they make guidelines available to these
audiences for evaluation results. The guidelines are pertinent, however,
to the minority of LEAs which do more than simple monitoring.
Field Tests. We do not recommend incorporating guidelines into law or
regulation. Only some aspects of guidelines have been field tested. And
regardless of how reasonable they appear to be in principle, their costs
and benefits need to be better established before they are generally
required. Moreover, it is sensible to determine their susceptibility to
incompetent interpretation, misinterpretation, and corruption. Finally,
guidelines will change a bit as the state of the art in evaluation develops.
And formal tests may help to avoid prematurely rigid posture on what
constitutes quality.

Caveats. Contemporary guidelines cannot be simply applied to evaluation
reports produced by LEAs in response to federal or state reporting
requirements. In the first place, reports differ appreciably in content
depending on audience. Reports made to Parent Advisory Committees in Title I
programs, for instance, contain information which differs in depth and in
kind from information provided to states. Second, requirements are minimal.
Any review of what is produced to fill requirements is not likely to be a
useful target for guidelines simply because more elaborate reports may and
do exist.

Estimating the Effect of Programs 

The general expectation that all local, state, and federal education
agencies will produce clear evidence on the effects of programs should be
abandoned. The emphasis should be placed on finding better variations on
programs in LEAs and SEAs which have the resources to plan and execute
fair field tests and on well designed federal tests.

Measuring growth of children in intellective achievement, in personal
development, and other areas is often warranted. But the practice of
attributing growth to a program on the basis of these data alone is not
warranted simply because there are so many competing explanations for
growth or any change. Local evaluation designs rarely recognize competing
explanations. See Chapter 5 for details.
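
A small worked example shows why growth data alone cannot be attributed
to a program. The test scores below are invented for illustration, and the
comparison group is assumed to be comparable to the program group (for
instance, because children were assigned at random), which the report
notes is often not the case in local settings.

```python
from statistics import mean

# Invented pre/post test scores, for illustration only.
program_pre = [40, 45, 50]
program_post = [48, 52, 59]
comparison_pre = [41, 44, 51]
comparison_post = [47, 50, 57]


def mean_gain(pre, post):
    """Average post-minus-pre change across children."""
    return mean(after - before for before, after in zip(pre, post))


# Gain in the program group alone: includes normal growth.
naive_gain = mean_gain(program_pre, program_post)

# Gain among comparable children who did not receive the program.
comparison_gain = mean_gain(comparison_pre, comparison_post)

# What remains after the competing explanation is set aside.
net_of_comparison = naive_gain - comparison_gain

print(naive_gain, comparison_gain, net_of_comparison)
```

In this made-up case the program group gains 8 points on average, but the
comparison group gains 6 without the program, so only 2 points remain once
normal growth is accounted for. Without a comparable comparison group, the
full 8 points would be credited to the program.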

The demand for information about how much a program affects children
must recognize that clearly interpretable estimates depend on evaluation
designs which accommodate competing explanations. Those designs are not
often feasible in local settings. Technical assistance is no substitute
for resources, interest, or those designs. Moreover, estimating effects
at the local level often has lower priority than providing services which
children and their parents want.

The demand for estimates of effect on children induces a kind of
benign hypocrisy among some staffers, administrators, and local
contractors responsible for programs and evaluations. An increase in test
scores is treated as evidence that the program "works." The conscientious
members of each group will admit that other explanations are possible,
normal growth, for instance. But they will also admit, and we agree, that
separating out the influence of the program from other influences is not
possible without a great deal of managerial, legal, and technical effort
and may be impossible despite those efforts. The admission does not appear
frequently in evaluation reports on Title I programs, vocational education,
and bilingual education.
Judging from our site visits, LEAs and SEAs are interested in testing
cheaper varieties of programs, program components, and the like, and some
of these are capable of doing this well. It is sensible to capitalize on
that interest and expertise if the evaluations of these are well designed.
To the extent possible, contracts for doing so ought to be made available.
Funds are available through Title IV-C and some NIE programs. They can
lead to better understanding of what works, what works more inexpensively,
and to the dissemination of the products through the JDRP-NDN system. The
effort may have to be augmented with assistance from universities, private
contractors, technical assistance centers or others. But these are not
substitutes for in-house staff and for strong administrative support of
fair tests from administrators and oversight groups.

The national interest in understanding effects of new programs, as
[illegible], needs to be recognized and reiterated. The conduct of
[illegible] programs should be supported where they are feasible and
appropriate.

The origins for this recommendation stem partly from the progress
made over the past ten years in mounting field tests of new programs,
program variations, and program components. There have been imperfections
and failures in these tests, to be sure. The execution of good outcome
studies is exceedingly difficult. These problems should not be regarded
[illegible] virtue of understanding effects. The public's interest
[illegible] education or in other areas such as [illegible] has not been
consistent. Planned tests are always vulnerable on this account as well as
on account of their youth.

[illegible] to whom services are delivered, [illegible] evaluation or
implementation studies, are also important. Judging by recent work, the
emphasis on this simple information [illegible]. There is an opportunity,
however, to obtain more than a nominal statement of where dollars go and
who [illegible] the services. The character of services is often poorly
understood. Any such investigation will not help one understand whether
services produce more notable effects than cheap competitors or no service
at all, of [illegible]

At the national level, we believe it is appropriate for public leaders
[illegible] provisions [illegible] law which will facilitate [illegible].
Facilitating Integrity

Evaluation often engenders concern among those whose program is
evaluated. And this in turn provokes the evaluator concerned if she or
he is under the supervision of the program manager. Consequently,
maintaining integrity may be difficult. Evaluation does demand some
fortitude as well as technical and political expertise.

The following list of options was developed to understand how one
might facilitate integrity at federal, state, and local levels. We have
capitalized on some experience outside education. We have not had the
time to explore each option adequately. But we believe they are worth
considering.



Option: Posture at the Policy, Management, and Oversight Levels of Government

There is some argument for the view that administrators of new and
innovative projects should not be judged solely on the basis of the success
of the projects for which they are responsible. Many educational projects
[illegible] often if not always beyond [illegible] individual or
institution. It is important to understand why [illegible]. Program
managers and their staffs, then, should be [illegible] evidence bearing on
a program, regardless of whether or not [illegible] a success. To be
effective, that view [illegible] national, state, and local levels. Many
[illegible] recognize that it also runs against social norms: it is easier
to advocate or oppose than it is to seek balanced information about
[illegible]. [illegible] accepted generally, it will take a good deal of
time [illegible] routine, perhaps as long as it took for the United
States to infuse integrity routinely into U.S. censuses: about 100 years.

Option: Design of Evaluations of New Programs

It is sometimes possible to accommodate fear of evaluation through
[illegible] program A by comparing it to no program at all. Rather, one
[illegible] variation A of the program to variation B, where each
variation has identical objectives but they differ in cost, approach,
or other respects. "No program at all" is often not a politically
viable option if A fails. Indeed, it is prudent to compare A [illegible]
believes in planning. If A and B are equally effective, [illegible]
less expensive, and [illegible] recognizing the high risk of any
innovative social program.
The difficulty with the option is that we often lack the imagination
or resources to invent B. And, of course, it provides no information on
effects of A relative to no program at all.

Option: External Review

One way to assure that incompetent evaluations and competent
evaluations are properly labelled as such is to subject completed
evaluations to external reviews. The tactic is consistent with the aims of
the education agencies, the U.S. General Accounting Office, and other
agencies with an interest in quality and standards of evidence. It is
consistent with the recent trend toward secondary analysis of program
evaluation data, conducted by independent academic institutions. The
latter option has been used by, among others, the U.S. Office of
Education, the National Institute of Education, the Law Enforcement
Assistance Administration, and other agencies in the United States. A
variant on the tactic has been tried by individual researchers in Pakistan
in reviewing evaluations sponsored by government. But there, as in other
developing countries, the matter receives no attention.

This option cannot assure directly that evaluations done with
integrity will be rewarded. Gratuitous criticism emerges quickly. It should
make it more likely, however, that the poor evaluations are recognized as
such and are not rewarded.



Option: Joint Dissemination and Review Panel Approaches

Consider a review board with clearly defined standards for examining
the quality of evaluations, and which examines quality in response to a
request from the program manager. A main objective of this panel or board
is to officially verify that evidence is good and the program, if effective,
deserves to be disseminated. Further, such a seal of approval can become
a device for obtaining more money for similar projects from an agency.
Both official recognition and the opportunity to apply for dissemination
funds are appreciated, we believe, by competent evaluators and program
developers.

Such a system has been operated with some success by OE and NIE.
The joint panel reviews educational products, basing review solely on
evidence which conforms to well articulated standards. The approval makes
them eligible for specially budgeted money earmarked for expansion,
dissemination, and other purposes. The eligible programs compete for
additional funds but with less competition than normal and more likelihood
of success.



Option: Recognizing Mislabelling and Deception 

Bureaucratic detection of mislabelling and deception is generally
a high art form. It is not yet as well developed in the evaluation arena
as it should be. To take the simplest case, descriptive surveys, needs
assessment surveys, studies of management and operations, and the like
are labelled "impact evaluations" or implied to be sufficient for impact
estimation. The former are easy by comparison to impact studies, they
differ in function and use, and they must not be confused. To the extent
that they are, people who do a fine job on difficult enterprises may be
inadvertently hurt by people doing a good job on easier tasks.
Option: Monitoring and Evaluating Use of Evaluation

One of the major concerns registered by program managers in local and
regional education agencies, on U.S. Office of Education sponsored programs
and elsewhere, is how evaluations are used. Evaluations are supposed
to be used, after all, if they are good. It is in the program manager's
interest to understand probable use, and moreover to protect against
ingenuous use.

Ingenuous use is possible, of course. Many managers, legislative
staffers, and so on have little understanding of the quality of evidence
and so may naively rely on poor data. That result may affect those with
good evidence negatively. Perhaps more important, the quality of
utilization will vary depending on the experience of the user. It is simply
not easy to capitalize on evidence in the interest of making decisions
or setting policy. One may, for example, decide on the basis of
early evaluation results that a project is ineffective when long term
results can show that the program is effective. One may decide that the
project in site 1 is ineffective when the project could or would work at
site 2. And so on.

Some mechanisms must be invented to encourage, assure, and monitor
the high quality use of evaluation results. Several suggest themselves.
Training might be useful. This may include case study approaches as in
Harvard's MA programs. The problem can be ameliorated partly through
invention of cheap monitoring devices for identifying instances of
information use, e.g., periodic telephone calls or cables to agency staff to
ask whether information has been used, how it was used and who used it,
and verification of the use.



Option: Explicit Policy on Independence

There is no substitute at the national, state, or local levels for
policy on relative independence of the evaluator.

That policy may involve bureaucratic independence, notably eliminating
clearance requirements for conversation or disclosure of reports to any
group. It may involve administrative independence, notably by assuring
that the evaluator report to an individual other than the program manager.

It may involve fiscal independence, notably by assuring that funds
earmarked for evaluation are channeled through the evaluation unit, by
setting salaries for the unit independent of salaries for program
operating units, and other methods.

It may involve political independence, for example, through the
bipartisan approval of the director of evaluation in the same spirit as
appointments are approved for Inspector General and Comptroller General.
Evaluation Capabilities

The reasons for suggesting that demands for evaluation be
preceded by "capabilities assessment", particularly at the state and
local levels, are as follows:

First, identifying who is and who is not an evaluator, much less the
appropriate competency level, is often difficult. Depending on the
program and the assigned tasks, program staff, evaluation unit staff,
[illegible] evaluation responsibility. Second, because the field
[illegible] institutions offer formal certification [illegible]
considerable debate about training, and graduate curricula vary in
emphasis across institutions.

[illegible] required of "evaluators" in LEAs and SEAs [illegible],
depending on evaluation activity. When evaluation [illegible] reporting
requirements, the skills demanded do not [illegible]. But some technical
common sense is essential. However, when evaluation activities go beyond
the minimum [illegible] sophistication of required [illegible]
activities and the capability demands should receive separate
consideration in law, regulation, and evaluation [illegible].

By capabilities assessment here we mean a systematic attempt to describe
what kinds of skills are required for what kinds of tasks. For national
[illegible] research [illegible] task analyses of good performers. But it
need not be elaborate. Observing [illegible]

Meeting Federal Evaluation Reporting Requirements. It cannot be expected
that all state and local education agencies have the capabilities necessary
[illegible] evaluation reporting requirements. Often, program [illegible]
agencies, individuals with responsibilities other than evaluation, assume
responsibility for reporting activities. These persons were not
necessarily hired for their evaluation expertise. Consequently,
technical assistance in evaluation should be provided so that agencies can
adequately fulfill federal evaluation reporting requirements. This
technical assistance can be provided in a variety of ways:



(1) At the minimum, the sponsoring agency should have direct
access to evaluation unit staff with explicit responsibility
for training in evaluation. These individuals can develop
appropriate guidelines for evaluation, arrange evaluation
workshops for individuals who must complete these requirements,
and select the proper strategy for providing technical assistance.
Federal program agencies without these resources should consider
creating specific job positions in evaluation.
(2) Adequate resources can be channeled to SEAs that administer
these federal grant programs to permit them to provide
easily accessible and expert technical assistance in
evaluation.

(3) Federally supported and administered centers such as those
existing for Title I evaluation can be established to assist
states and local education agencies in meeting federal reporting
requirements. One approach is to expand services of the present
TACs to include provision of evaluation assistance for other
federal programs.

Technical assistance in evaluation involves not only instruction and
guidance in the actual conduct of evaluation, i.e., selection of program
participants, the use of tests, and the completion of federal reporting forms.
It also involves assistance in deciding who will evaluate. For example,
districts that have capable evaluation units should be encouraged to use
the services of the unit for all program evaluation needs. Small districts
which do not have the resources to form their own research and evaluation unit
may be instructed in other options, e.g., the formation of a consortium to
hire competent evaluation staff who serve more than one district. Regional
assistance centers can be developed or augmented in order to better provide
technical assistance in evaluation. When outside contractors are employed,
guidelines must be developed so that program staff and district selection
boards can choose the most competent individuals and be sensitive to the
types of skills required and to their rights in contractual arrangements.

Going Beyond Federal Evaluation Reporting Requirements. Meeting federal
evaluation reporting requirements constitutes minimum acceptable performance
for evaluation activities associated with federal programs. Expecting only
minimum performance is short-sighted. Some districts and states often
attempt to go beyond these requirements. And, if competently executed, these
activities can improve the quality of information submitted to the federal
agencies, Congress, and to such other audiences as Parent Advisory Councils
and school boards.

We believe that providing more opportunities to those LEAs and SEAs
with interest and capabilities in evaluation is warranted. At the state
level, this can be accomplished through such existing mechanisms as the monies
targeted for improving state capabilities and state refinement grants for Title I
evaluation supported by Section 183(c). These funding mechanisms should be
supported. Dissemination of demonstrated improvements in evaluation practices
developed by SEAs through such contracts should be promoted.

The improvement of local education agency capabilities deserves more
attention than it has received in terms of discretionary evaluation activities.
While some of this can be accomplished through an expanded SEA role, other
methods can be more specifically targeted at LEAs and supported directly from
the federal government. Options include:
Expanding the program of direct contracts [illegible] for
[illegible]-related activities [illegible]. [illegible]
grants should allow LEAs to apply for and receive funds
to engage in additional evaluation activities for federal
programs or research on ways to improve evaluation methods.

Making available grants to LEAs/SEAs to foster university
[illegible]. This might include [illegible] training programs jointly
sponsored by academic institutions and LEAs/SEAs. This would not only
provide training [illegible] personnel but also improve the quality of
evaluation [illegible] universities by allowing students to participate
in [illegible]. [illegible] university conducted workshop could
[illegible] continuing education for [illegible] personnel. There may
also be an opportunity to [illegible] monies for SEA/LEA investment in
such arrangements [illegible] university faculty can spend a period of
[illegible] agencies conducting evaluations and designing procedures
which will remain after their departure.
Legislative and Management Background of the Project:
Excerpts From Public Law
95-561, Conference Report,
Congressional Record, and USOE Work Statement.



Section 1526. The Commissioner of Education shall conduct a study of
evaluation practices and procedures at the national, state, and local levels
with respect to federally funded elementary and secondary educational
programs and shall include in the first annual report to Congress submitted
more than one year after the date of enactment of this Act proposals and
recommendations for the revision or modification of any part or all of such
practices and procedures. Such proposals and recommendations shall include
provisions

(1) to ensure that evaluations are based on uniform methods and
measurements;

(2) to ensure the integrity and independence of the evaluation
process; and

(3) to ensure appropriate follow-up on the evaluations that are
conducted.



EXCERPT: CONFERENCE REPORT 

32. Study of Evaluation 

[illegible] amendment, requires the Commissioner
to conduct a comprehensive study of evaluation practices and procedures at
the national, state and local levels with respect to federally funded
elementary and secondary programs, and to submit a report within 1 year.

The Senate recedes with an amendment encouraging this study to be
intensive on certain problems instead of comprehensive.
EXCERPT: CONGRESSIONAL RECORD
HOUSE, H6671, 1978



Amendment offered by Ms. Holtzman 

Ms. HOLTZMAN. Mr. Chairman, I offer an amendment.

The Clerk read as follows:

Amendment offered by Ms. HOLTZMAN: Page 375. Insert the following
new section after line 25:

COMPREHENSIVE STUDY OF EVALUATION
Practices and Procedures

"Sec. 1331. The Commissioner of Education shall conduct a comprehensive
study of evaluation practices and procedures at the national, state, and local
levels with respect to federally-funded elementary and secondary educational
programs and shall report to Congress within one year after the date of
enactment of this Act with proposals and recommendations for the revisions and
modification of any part or all of such practices and procedures. Such
proposals and recommendations shall include provisions

(1) to ensure that evaluations are based on uniform methods and
measurements;

(2) to ensure the integrity and independence of the evaluation process;

(3) to ensure appropriate follow-up on the evaluations that are conducted."
Redesignate the following section and conform the table of contents
accordingly.

Ms. HOLTZMAN (during the reading). Mr. Chairman, I ask unanimous consent
that the amendment be considered as read and printed in the RECORD.

The CHAIRMAN. Is there objection to the request of the gentlewoman from
New York?

There was no objection.

(Ms. HOLTZMAN asked and was given permission to revise and extend her
remarks.)

Ms. HOLTZMAN. Mr. Chairman, I want to compliment the Committee on Education
and Labor and especially the distinguished and able Chairman of the committee
(Mr. PERKINS) for this very important bill on elementary and secondary education.

I am offering a modest but, I think, necessary and useful amendment. It
simply would require the Commissioner of Education to review the present
procedures for evaluation to see how they could be improved and to report back to
Congress with his suggestions and recommendations within 1 year.

I am concerned that we try to get the most out of our education dollars.
This is impossible without an effective evaluation process which allows us to
learn from our mistakes and build on our successes.

At present, most federally funded programs in elementary and secondary
schools must be evaluated, but the evaluation process is inadequate. There are
no requirements for follow-up on evaluations, and many, including highly
critical ones, are filed away or forgotten. The absence of any uniform procedures
or standards makes it difficult to compare different education programs and
permits the use of inadequate evaluations at the local and state level. Another
problem is that many school districts select their own evaluators, who in order
to obtain future contracts may be less than candid in their appraisals.



APPENDIX 1 



To improve education programs, we need honest, thorough, effective
evaluations which are followed up.

My amendment calls on the Commissioner of Education to recommend ways of
improving the integrity and effectiveness of evaluations. Without such
evaluations we cannot be sure that our education dollars are as well spent
as they should be.

Mr. PERKINS. Mr. Chairman, will the gentlewoman, Ms. HOLTZMAN, yield
to me?

Ms. HOLTZMAN. I would be delighted to yield to the distinguished
gentleman from Kentucky.

Mr. PERKINS. I thank the gentlewoman for yielding.

Let me say that in my judgment the gentlewoman's amendment is entirely
appropriate and fitting at this place in the RECORD. It is something that
should be required, appropriate evaluations, and we accept the amendment.

Mr. QUIE. Mr. Chairman, will the gentlewoman yield?

Ms. HOLTZMAN. I yield to the gentleman from Minnesota.

Mr. QUIE. I thank the gentlewoman for yielding.

I believe that the 1-year comprehensive study of the evaluation practices
would be very beneficial to this committee, and I would be happy to accept
the amendment.

Ms. HOLTZMAN. I thank the gentlemen for their comments.
Mr. Chairman, I yield back the remainder of my time.

The CHAIRMAN. The question is on the amendment offered by the
gentlewoman from New York (Ms. HOLTZMAN).
The amendment was agreed to.
EXCERPT: WORK STATEMENT
ISSUED BY THE U.S. OFFICE OF EDUCATION
OFFICE OF EVALUATION & DISSEMINATION
IN RESPONSE TO SECTION 1526, PUBLIC LAW 95-561

1. The Investigator. In view of the self-evaluation aspect of the
study, it should not be carried out by those persons or groups ordinarily
involved in the evaluation of federally funded education programs. The
services of a prestigious and experienced person with a national reputation
in educational evaluation who is known for independence and impartiality
should be obtained to oversee the study and to report directly to the
Commissioner.

2. The Questions to Be Asked. The questions posed by Rep. Holtzman's
request should be expanded upon in a more detailed fashion and priority
should be assigned to them, especially with regard to the specificity with
which they will be addressed at each of the three levels of the study:
local, state and federal. This should be done in part by conferring
directly with Rep. Holtzman. The three main questions encompass a number
of allied questions that can be addressed. Examples of some of these are
given below.

(a) Why and how are evaluations carried out (methods and measurements)?

(i) for whose information needs are evaluations carried out?

(ii) what procedures are used in carrying them out?

(iia) how "sound" are they (accurate, valid, reliable,
etc.)?

(iii) what measurement techniques and devices are used, and how
appropriate are they to the program's objectives?

(iv) are there conflicting requirements for different programs
that lead to duplicative or unusually burdensome efforts?

(b) What are the capabilities (including integrity and independence)
of those who carry out evaluations?

(i) where are evaluation activities located organizationally?

(ii) what is the background, training and experience of the
staff and of those who carry out evaluations? If such
services are provided by outside agencies, how are their
services obtained and what is the nature of their
relationship to in-house staff?
(c) How are the results of evaluations used?

(i) what are the conditions that facilitate or detract from
their use?

(ii) when results are used, what is the nature of the changes
they lead to?; can exemplary uses be identified?; is
their nature such that they can be adopted in other
settings?

(d) What legislative, regulatory, funding or other changes might be
proposed as means to improving the nature, conduct and utility
of evaluations?

3. The Programs to Be Examined. Clearly, not all federally funded
elementary and secondary programs can or should be examined. Rather, a
set of programs that are numerous and diverse enough
(e.g., state administered versus direct grant) to allow one to tease out
all the issues involved, yet few enough in number to allow an intensive
examination of certain problems (as suggested by the Senate in the conference
report), should be included. Examples of these are: Title I and VII, ESEA;
ESAA; Handicapped; and Vocational Education (these are programs that have
[illegible] requirement; two are direct grant programs (ESAA and
Title VII) while the remainder are State administered).

4. The Conduct of the Study. The investigator would prepare a detailed
Work Statement and would acquire the services of an 8a contractor to
provide him with the administrative, logistical and personnel support needed to:

(a) Review reports of those agencies involved in evaluation.

(b) Interview persons involved in the initiation, conduct and
utilization of evaluations.

(c) Convene a panel of prestigious persons to review the findings
from (a) and (b) and to make recommendations for the improvement
of evaluation practices.
3.1 Introduction

The main justification for this Project is Congressional interest
in the character of federally supported program evaluations. In
particular, Section 1526 of Public Law 95-561 requires that "The Commissioner
of Education shall conduct a study of evaluation practices and procedures
at the national, state, and local levels with respect to federally funded
elementary and secondary educational programs..."

The study design was based on discussion among Congressional staff,
federal agency personnel, and Project staff. The study was undertaken
specifically to furnish appraisal independent of federal agencies and to
examine evaluation at federal, state, and local levels of government. The
Project is prospective in orientation, designed to provide recommendations
about evaluation policy and practice, the evidence to sustain recommendations,
and the identification of issues and options. Specific questions to be
addressed by the Project include:

• Why and how are evaluations carried out?

• What are the capabilities of those who carry out evaluations?

• How are the results of evaluation used?

• What recommendations can be made to improve policy or practice?

Answers to each question were obtained for federal, state, and local
levels of administration, as requested by the Congress. The main vehicles
for providing answers to such questions are a critical examination of
contemporary research on each topic, field work by Project staff, and roundtable
discussions and formal presentations at professional meetings.

Contemporary research that was reviewed by the Study staff included
major evaluation studies conducted by HEW's Division of Education, special
studies of the quality and uses of evaluation data at local, state, and
federal levels, and Congressional testimony and records on initiation,
conduct, and use of evaluations. The special studies have been supported by
the National Institute of Education, U.S. Office of Education, and National
Science Foundation. Samples of evaluations undertaken at the local school
district level were also reviewed. This part of our investigation was
supplemented by interviews with the individuals responsible for production
of the research.

The second major source of information was interviews conducted by
Project staff directly with individuals with an interest in evaluation of
educational programs. Three groups were involved. The first was local
school district staff, who were interviewed both in site visits and in a
telephone survey. The second was state officials with responsibility in
the area, again interviewed on site and through telephone surveys. The
final group included federal agency staff and Congressional staff,
interviewed primarily through site visits.
The third major source of information was roundtable discussions
conducted at Northwestern University, and discussion of Project activities
at professional society meetings. The topics for the roundtable
discussions included utilization, local school evaluation, parental
interests in evaluation, and others.
Appendix 3.2

Recent Research on Evaluation:
Abstracts and Literature Review

The Project on Evaluations of Federally Funded Education Programs relied
heavily on earlier research by Northwestern and by other private and public
institutions. The major studies on which we have relied are summarized below.
The ones most relevant to evaluation capabilities and organization include:

• UCLA Center for the Study of Evaluation, Survey
  of Large School District Evaluation Units

• Bureau of Social Science Research, Survey of Performers
  of Research and Research Related Activities

• Northwestern Project on Evaluation of Evaluations

• Hope Associates Performance Review of Technical
  Assistance Centers

The major resources for information on use of evaluation results and of
research more generally include studies at local, state, and federal levels.
For the local level, these include:

• SRI International, Evaluation of the National Diffusion Network

• SRI International, Study of Local Uses of Title I Evaluation

• UCLA Center for the Study of Evaluation, Survey of School
  District Evaluation Units (Lyons and others) and Case Studies
  (Alkin and others)

• Rand Change Agent Study and Datta (NIE) Reanalysis

• Huron Institute, Study of Local Uses of Evaluation
  (unavailable at this writing).

The state level studies include:

• SRI International Evaluations of the National Diffusion Network
• Bissell Case (California) Study

Federal level studies include:

• Florio (AERA) Survey of Congressional Staff
• Milsap (NIE) Case Study
• DHEW/ASPE (Wholey) Survey






APPENDIX 3 



Studies of Capabilities and Organization 



The following material summarizes each of these studies. References
to reports are furnished at the end of this appendix.

UCLA Center for the Study of Evaluation's Survey of
Large School District Evaluation Units

During 1977-78, the University of California's Center for the Study
of Evaluation examined the organization of local school district offices of
evaluation. The Center is a regional laboratory, supported by the National
Institute of Education. This study focused on a target population of school
districts having an enrollment of 10,000 or more students and an organizational
unit for evaluation. Some 350 districts which met both of these criteria
were surveyed and 72% responded.

The study obtained data on the number of districts with evaluation
units, the qualifications of unit directors, size of staff and staff needs,
activities falling under the rubric of evaluation, organizational
characteristics of units, use of consultants, and funding. Details are given in a
report, Evaluation and School Districts, by Lyons, Doscher, McGranahan, and
Williams. The study was a major resource for Northwestern's examination of
definition, capabilities, and organization of evaluation at the local level.

Bureau of Social Science Research's
Survey of Performers of Research and Research Related Activities

During 1976-78, the Bureau of Social Science Research (BSSR), a private
contracting firm in Washington, D.C., undertook a study to identify and
describe nonfederal organizations that conduct research, development,
dissemination, and evaluation (RDD&E) in education. Screening of over 6,000
organizations resulted in a group of 2,434 organizations within 1,530
institutions which provided information on capabilities. Screening was
based on the organization's having been an active performer of educational
RDD&E during the preceding year, having a distinct organizational identity,
and having appreciable autonomy in carrying out educational RDD&E. The
largest subgroup of institutions in the respondent group was the public
education sector: 37 state education agencies, 193 intermediate service
agencies, 401 local education agencies. The academic sector, colleges and
universities, included 423 institutions. There were 476 miscellaneous
institutions such as private contractors. BSSR obtained information on
funding and expenditures of the organizations, staff size and education
level, functional emphases, number, duration, and character of projects.
Details are given in a report by Frankel, Sharp, and Biderman.

Northwestern University's Project on Evaluation 
of Evaluations 

The Northwestern field study has been dedicated to confirming some
findings of the earlier studies and examining more closely arguable findings,
based on surveys of school districts, state education agencies, and federal
agencies. This includes identifying staff background, training, major
responsibilities, use of outside contractors, and difficulties in hiring
competent staff and upgrading skills. It also includes reanalysis of
earlier studies by the Bureau of Social Science Research and UCLA's Center
for the Study of Evaluation, and identifying issues which were not addressed
in the earlier work.



Hope Associates, Inc.'s Performance Review of
Technical Assistance Centers

Ten Technical Assistance Centers (TAC) have been operating since 1976
to provide aid to state and local education agencies involved in Title I
program evaluations. They were reviewed during 1978-79 by a panel of
educational experts under contract to Hope Associates, with DHEW's Office
of Assistant Secretary for Planning and Evaluation. The review involved
site visits to each TAC, to a dozen federal agency executives, and to 25
state education agencies, telephone surveys of the remaining 25, and
examination of pertinent printed material. Details are given in a report
issued by Hope Associates.



Studies of Utilization 



Stanford Research Institute's
Evaluation of the National Diffusion Network

The National Diffusion Network (NDN) was established by USOE in 1974
to foster diffusion and adoption of exemplary education programs. It
was evaluated during 1975-76 to "understand the relative effectiveness"
of the system. The evaluation was done by SRI under contract with the
USOE's Office of Planning, Budgeting, and Evaluation. Evaluation involved
analysis of pertinent source documents; mail survey of over 900 local
education agencies, over 40 program developers, and over 60 facilitators,
16 developers, and 36 adopters; and site visit interviews with 149 teachers
and 30 principals. The main goals of the evaluation were to provide
comprehensive description of the NDN process, to evaluate the organizational
effectiveness of NDN, and to make recommendations based on the inquiry.
We are aware of no critique or secondary analysis. Details are given in a
report by Emrick and others.



SRI International's Study of Local
Uses of Title I Evaluations

The study, supported by the Office of Assistant Secretary for Planning
and Evaluation, was designed to determine whether local school district staff
used Title I evaluation results to "identify strengths and weaknesses of
their programs in order to improve them" and "whether the recent and proposed
changes in Title I evaluation was likely to alter local use of evaluation"






(p. iii). The study focused on 15 Title I districts in three states,
selected through referrals or recommendations by federal staff and Technical
Assistance Directors on the basis of their beliefs that the districts were
"especially concerned with evaluation." Site visits were undertaken to
interview administrators, principals, teaching staff, and parents. Details
are given in a 1978 report by Jane David.

UCLA Center for the Study of Evaluation's
(Alkin, Daillak, and White) Case Studies in Compensatory Education
and Innovative Projects

Five in-depth case studies examined the uses of evaluations in projects
for Title I compensatory education and innovations at the local level
(Title IV-C of ESEA). The studies described the community, school district,
and setting of the project, the decision makers and evaluators, the nature
of the project, the nature of the evaluation, and uses made of the evaluation.
In all cases, at least some use was made of evaluation information.
Different audiences used evaluations differently. The evaluation frequently
affected thinking and decisions jointly with other information. Details
are given in a 1979 report by Alkin, Daillak, and White.

UCLA Center for the Study of Evaluation's
(Resnick, O'Reilly, and Majchrzak) Survey of School Districts

Using the survey data collected in the Center for the Study of
Evaluation's survey of evaluation units within school districts, these researchers
concluded that evaluations geared toward resource allocation decisions were
used more by the superintendent and the school board, while evaluations
directed at developing or modifying curriculum were used by the program
administrator, with a trend toward use by teachers and the superintendent.
Details are presented in a report by Resnick and others.

UCLA's Center for the Study of Evaluation's (Alkin, et al.) Study of Title VII

Evaluation reports and independent audits of the reports were
examined for 42 projects in bilingual education funded under Title VII. The
authors also obtained federal monitors' ratings of the quality of projects
and questionnaire responses from the project directors. Although the
authors stress that their data are not stable, with extremely wide confidence
limits, a sample of the findings is presented. See the 1974 Alkin et al.
report for details.

RAND's Study of Federal Programs Supporting Educational Change

Implementation and continuation of programs aimed at inducing
educational change in local school districts were examined. These involved
four federal programs: Title III of ESEA, the Elementary and Secondary
Education Act; Right to Read; Vocational Education Act, Part D; and ESEA
Title VII (bilingual). Two hundred ninety-three projects were studied in






18 states, involving interviews with teachers, principals, program managers,
superintendents, and federal program administrators. Twenty-nine projects
were studied in depth. One hundred projects were reexamined to determine
whether projects were continued at the end of federal funding. Implementation
and continuation of projects were affected by: 1) the degree to which the
innovation matched the local education agency's objectives; 2) the degree
to which the innovation was consonant with the values of the local education
agency; amount of change required; and 3) complexity of the innovation.
See the report by Berman and McLaughlin for details.

Datta's Reanalysis of the RAND Study 

Lois-ellin Datta of the National Institute of Education challenged
the conclusion of RAND's report, "Federal Programs Supporting Educational
Change," that use of outside technical assistance was not effective in
promoting local school district change. First, few characterizations of
such experts could be found in the reports. Second, a follow-up
questionnaire indicates greater use of experts than appears in RAND's conclusions.
All forms of assistance, local and outside, were viewed as being not very
useful by teachers. Outsiders were perceived as being more useful in those
studies in which program goals were achieved. In most instances, teacher
participation and involvement was not more strongly related to project success
than use of outside experts. Datta therefore believes that this particular
conclusion of the RAND report should no longer be cited. See Datta's 1980
paper for detail.

Bissell's Case Study 

Joan Bissell presents four cases in which the California state
legislature used evaluations in making decisions about the form and funding
for programs. She also discusses use of basic indicators by the legislatures
and offers suggestions for improving use of evaluations by legislatures.

Florio's Survey of Congressional Staff

Twenty-six Congressional staff members who dealt with education issues
were interviewed. Agencies serving Congress were the most frequent source
of research information, and in general, Washington-based sources were
relied on most. Cost information was most frequently mentioned as useful,
followed by achievement scores. Different types of information were useful
at different phases of the congressional cycle. Evaluations compete with
other information for Congressional attention and trust. Details are given
in a paper by David Florio.

Mitchell's Study of State Legislators

Mitchell interviewed 160 legislators and staff members in the states of
Arizona, California, and Oregon about issues in education. Respondents described
legislation in education in recent years, the situation and issues involved in
this legislation, and the resources used to influence decision making. Mitchell's
results point to the importance of legislative cycles in getting information used.

Milsap's Case History of Experience Based Career Education

A case history is presented of the manner in which the Experience
Based Career Education program became the priority program for funding
and adoption through Vocational Education Act, Part D. The program and
its evaluation are described, its process through the Joint Dissemination
Review Panel is given, and the process by which regulations were revised
so that the program would have priority. Milsap raises two issues in this
case history of research utilization. The first relates to the lack of
visibility of regulations and the degree to which the public becomes
involved. The second relates to the adequacy and scope of evaluations
that are intended to be used for such purposes and to potential issues of
misutilization. See Milsap's 1979 report for details.



Literature Review

A great deal of the literature bearing on evaluation of federally
funded programs was reviewed before this Project was undertaken. Earlier
work supported by the National Institute of Education was published as:

Boruch, R. F., and Wortman, P. M. Implications of educational evaluation
for evaluation policy. In D. C. Berliner (Ed.), Review of Research in
Education. Washington, D.C.: American Educational Research Association,
1979, pp. 309-363.

Earlier work supported by the Health Care Financing Administration has been
issued as:

Leviton, L. C., and Hughes, E. F. X. Utilization of evaluations:
A review and synthesis. Working paper #34, Center for Health Services
and Policy Research, Northwestern University, Evanston, Illinois, 1980.

Other reports included in the literature reviewed, and papers which
have been abstracted here, are cited in the text of this report.



3.3 Site Visits: Federal Agencies

The choice of federal offices to interview personally or by telephone
was determined primarily by whether the office had responsibility to initiate,
fund, execute, or review evaluations and by the level of evaluation effort.
User groups were determined by the target program for evaluations. The choice
of individuals to interview within the office was determined primarily by
determining whether they were knowledgeable about evaluation.

No central archive of federal evaluation experts exists. However,
organizational charts and normal administrative data on contracts were
very helpful. For instance, the USOE's Office of Evaluation and
Dissemination has regularly produced a useful report on evaluation contracts,
amounts, contractor, and contractor monitor. The project monitor list
was a basis for many of the interviews within the Office.

The agencies in which staff members were interviewed included:

U.S. Office of Education
    Office of Evaluation and Dissemination
    OED Division of Elementary and Secondary Programs
    OED Division of Occupational, Handicapped, Developmental Programs
    OED Office of the Assistant Commissioner
    OED Division of Educational Replication

National Institute of Education
    Program on Dissemination and Improvement of Practice
    Program on Testing, Assessment, and Evaluation
    Office of the Director

DHEW
    Office of the Assistant Secretary for Planning and Evaluation
    Office of the Assistant Secretary for Education

Congressional Research Service
    Division of Science Policy
    Division of Education and Public Welfare

Bureau of Education for the Handicapped
    Division of Assistance to States

U.S. General Accounting Office
    Program Analysis Division
    Human Resources Division






3.4 Site Visits and Telephone Surveys:
Local and State Education Agencies

This section describes the procedures used to execute site visits to
local and state education agencies, telephone surveys of the agencies, and
the rationale for procedures.

Introduction

One field study component of the Project involved site visits and
interviews at 6 state offices of education and 12 school district offices
of evaluation. These were used as a vehicle for case studies. Telephone
surveys of a sample of over 40 state offices of education, and of 200
school districts were also undertaken.

The main purposes of site visits and telephone surveys were to verify
conclusions and implications of earlier research on the topic, to update
what was known about local and state views of evaluation, and to assure
that local and state views were represented in our report. Site visits are
the basis for qualitative description, i.e., case studies. Telephone surveys
are the basis for quantitative description.

Site Visit Protocol 

The procedure for site visits to LEAs and SEAs involved:

(a) Initial contact with respondents by telephone,

(b) Follow-up letter to respondents outlining project and
questions, confirmation of site visit date and visitors,

(c) Site visit and follow-up contact to clarify ambiguities.

The telephone contact and letter introduced the study in the following way:

"The Project's main purpose is to review educational program
evaluations supported by the federal government and to make recommendations
for their improvement. The effort is mandated by Congress under Public
Law 95-561 and is designed specifically to furnish appraisal independent
of federal agencies and to examine evaluation at federal, state, and
local levels of government. The Project is prospective in orientation,
designed to provide: recommendations about evaluation policy and
practice, the evidence to sustain recommendations, and the identification
of issues and options. Questions to be addressed by the Project include:

. Why and how are evaluations carried out?

. What are the capabilities of those who carry out evaluations?

. How are the results of evaluation used?

. What recommendations can be made to improve policy or practice?

The main vehicles for providing answers to such questions are a critical
examination of contemporary field research on each topic and field work by
Project staff.

Current research which the Project will reanalyze includes major
studies by commercial contractors, school districts, state offices of
research, and federal agencies such as the U.S. Office of Education,
National Institute of Education, U.S. General Accounting Office.

The product, a report to Congress, is scheduled for completion
in June 1980. It will cover premises, recommendations, issues,
options, and evidence bearing on evaluation policy. Information
generated by the Project will be made available for competing analyses.
Robert F. Boruch and David S. Cordray, of Northwestern University,
have primary responsibility for direction of the Project and the
final report.

The questions we'd like to discuss with your staff are identical
to those outlined above: Why are evaluations done? What are
capabilities of evaluators? How are results used? What recommendations
can be made?

The responses to questions will not be attributed to individual
respondents, unless the respondent prefers that he or she be identified.
That is, individually identifiable responses will be maintained as
confidential to the extent permitted by law."

Telephone Interview Protocol 

The procedure for telephone interviews involved:

(a) Initial investigation to identify the evaluation officer
within school district/state.

(b) Follow-up letter to respondents outlining project and
questions.

(c) Telephone interview.

The information to be provided to the respondent in the initial contact
and in the formal letter is identical to that given earlier for site visit
interviews.

Selection of Local Education Agencies for Site Visits

The sampling frame used to select LEAs for site visits was defined
as the largest 250 local school districts as determined through the
National Center for Education Statistics' Education Directory: Public
School Systems 1977-78. The ranks of each school district appear on pages
243-248 of this document. In total, 722 school districts are ranked,
each of these having over 10,000 students enrolled during the 1976-77
school year.

The sampling frame was restricted to the largest 250 school districts for the
following reasons:

. We wanted sites that had a large enough enrollment to
ensure that an evaluation office would be in place.

. We wanted sites that have multiple programs that would be
in place.

Such a restriction regarding the sampling frame eliminates from
consideration small school districts. The general feeling among the
Northwestern staff was that we needed to examine evaluation practices
within organizations (e.g., Offices of Evaluation and Research). Small
districts would probably not have such an organizational structure. Some
smaller districts have been included in the telephone phase of the project.

Sample size for site visits was restricted since the main purpose is
to develop case studies and not statistical generalizations. The actual
sample to be site visited was constructed in the following way:

1. 250 of the largest school districts were individually listed
on 3x5 cards, organized by State within Region.

2. Sample size was set at 16.

3. Using a random start, every 15th card was selected. This
process was repeated twice, generating two lists of 16 sites.

4. The two lists were treated as containing 16 pairs of sites
and each pair was again randomly assigned to one of the two lists.

5. The two resulting lists were then randomly allocated to primary
vs. secondary status. List 1 will be the basis for initial
contact. In the event that we are not permitted to visit
one of the sites listed in the primary list, its paired site
from list 2 will be contacted.

6. Any information about the particular LEA or SEA that we obtained
through interviews has been included on the attached enumeration
of sites.

7. Comparison of the sample with the distribution of Region and
size of enrollment for the "population" reveals a close
correspondence.

8. The information that OED is compiling for us on amount and
type of federal funds will be used as background information
in preparation for the site visits.
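
The selection steps listed above can be sketched in code. This is only an illustration under stated assumptions, not the Project's actual procedure: the card deck is represented by the integers 1 to 250, the function names `systematic_draw` and `paired_lists` are hypothetical, and because 250 is not a multiple of 15 a draw can yield 17 cards, so the sketch truncates each draw to 16 (the report does not say how that case was handled).

```python
import random

def systematic_draw(cards, interval, size, rng):
    """Take every `interval`-th card after a random start, keeping `size` cards."""
    start = rng.randrange(interval)          # random start within the first interval
    return cards[start::interval][:size]     # truncate: 250/15 can give 17 cards

def paired_lists(cards, interval=15, size=16, seed=None):
    """Build primary and secondary site lists from two systematic draws."""
    rng = random.Random(seed)
    first = systematic_draw(cards, interval, size, rng)
    second = systematic_draw(cards, interval, size, rng)
    list1, list2 = [], []
    # Treat the two draws as 16 pairs and randomly assign each pair's members.
    for a, b in zip(first, second):
        if rng.random() < 0.5:
            a, b = b, a
        list1.append(a)
        list2.append(b)
    # Randomly decide which list is primary and which is the backup.
    if rng.random() < 0.5:
        list1, list2 = list2, list1
    return list1, list2

cards = list(range(1, 251))                  # stand-in for the 250 index cards
primary, secondary = paired_lists(cards, seed=1980)
```

Under this sketch, `secondary[i]` is the paired replacement for `primary[i]` if a primary site declines. If both draws happen to share the same random start, the pairs coincide; a real procedure would presumably redraw in that case.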

The actual sites are listed in Table 1.

Table 1: Site Visits

Site                                 Region   Enrollment   SEA
San Diego Unified School District               125,463    California
Broward County, Florida                 4       136,000
St. Louis, Missouri                     7        73,000
St. Paul, Minnesota                     5        38,105    Minnesota
Colorado Springs, Colorado              8        32,452
Amarillo, Texas                         6        25,899    Texas
Jersey City, New Jersey                 2        34,698    New Jersey
Springfield, Massachusetts              1        27,892    Massachusetts
*Lansing, Michigan                      5        28,984    Michigan
*Gaston County, North Carolina          4        34,759
*Jefferson Parish, La.                  6        68,851
Charleston, W. V.                       3        45,428



[...] education agencies were unwilling or [...] included Norfolk (Virginia),
Wake County (North Carolina) [...]

* [...] pertinent staff members at [...]

Sample Selection for Telephone Interviews: School Districts

The design for this study involves a stratified random sample of 200
local school districts. The respondent in each case will be the person with
major responsibility for program evaluation in the school district, often
a director of research or of evaluation.

The main stratification variables for school districts are (a) size
of student body and (b) level of federal funding for education programs.
Population listings containing such information are available for 1976. We
do not have the resources to update this listing and must assume that major
changes are unlikely; the changes in the nature of federal funding of major
programs over the past 4 years have not been substantial and this makes the
assumption tenable. The sample size was selected to be of minimum size on
the basis of confidence intervals for parameters in the aggregate sample.
In particular, we determine sample size for proportion sampling based on
the assumption that p = .10 for a 95% confidence interval (acceptable error
margin of 5%). This is a simple approach, but we believe it will be sufficient
to inform us about gross characteristics of the target group within our time
and budget constraints.
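
The sample-size reasoning in the preceding paragraph can be illustrated with the usual normal-approximation formula for a proportion under simple random sampling. The report does not state which formula was used, so the formula, the critical value 1.96, and the function names below are assumptions made for illustration only.

```python
import math

Z_95 = 1.96  # assumed normal critical value for a two-sided 95% interval

def required_n(p, margin, z=Z_95):
    """Smallest n with z * sqrt(p(1-p)/n) <= margin (simple random sampling)."""
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)

def margin_of_error(p, n, z=Z_95):
    """Half-width of the normal-approximation interval for a proportion."""
    return z * math.sqrt(p * (1 - p) / n)

n_min = required_n(0.10, 0.05)        # 139
moe_200 = margin_of_error(0.10, 200)  # about 0.042
```

Under these assumptions, a proportion near .10 needs roughly 139 respondents for a 5% margin, and a sample of 200 gives a half-width of about 4.2 percentage points, which is inside the stated acceptable error.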

The sample itself will be selected by taking every Nth school district
in a stratified list, beginning with a random start, N being chosen to yield
a sample size of 200. Replacement will be necessary for those districts
undergoing transition, and will be taken randomly from the relevant strata
in the list.
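
A minimal sketch of this every-Nth selection follows, assuming a hypothetical frame of 1,000 districts tagged with made-up size and funding strata; the actual 1976 listing and its strata definitions are not reproduced in this appendix, and the replacement step for districts in transition is not shown.

```python
import random

def systematic_sample(ordered_units, n, seed=None):
    """Select n units by taking every k-th entry after a random start.

    Assumes len(ordered_units) >= n; k plays the role of "N" in the text.
    """
    rng = random.Random(seed)
    k = len(ordered_units) // n              # sampling interval
    start = rng.randrange(k)                 # random start within first interval
    return ordered_units[start::k][:n]

# Hypothetical frame: (district id, size stratum, funding stratum)
frame = [(i, i % 3, i % 2) for i in range(1000)]
frame.sort(key=lambda d: (d[1], d[2]))       # order the list by strata first
sample = systematic_sample(frame, 200, seed=7)
```

Ordering the list by strata before stepping through it is what spreads the selections across size and funding groups roughly in proportion to their share of the frame.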

Construction of Interview Questions 

The construction of interview questions was guided by the general
questions that we were required to address by our contract. They were elaborated
in discussion, put into a form suitable for interview work, pilot tested,
and then revised. The questions outlined in the original work statement
for the Project for the study were:

(a) Why and how are evaluations carried out (method and measurements)?

    (i) for whose information needs are evaluations carried out?

    (ii) what procedures are used to carry them out? How sound
    are they?

    (iii) what measurement techniques and devices are used and
    how appropriate are they to the program's objectives?

    (iv) are there conflicting requirements for different
    programs that lead to duplicative or unusually burdensome
    efforts?

(b) What are the capabilities, including integrity and independence,
of those who carry out evaluations?

    (i) where are the evaluation activities located organizationally?

    (ii) what is the background, training and experience of the
    staff and of those who carry out evaluations? If such
    services are provided by outside agencies, how are their
    services obtained and what is the nature of their
    relationship with in-house staff?

(c) How are the results of evaluation used?

    (i) what are the conditions that facilitate or detract from
    their use?

    (ii) when results are used, what is the nature of the changes
    they lead to? Can exemplary use be identified? Is
    their nature such that they can be adopted in other
    settings?

(d) What legislative, regulatory, funding or other changes might
be proposed to improve the nature, conduct, and utility of
evaluations?



Content of Questions 

The specific questions addressed in the site visits are given in
Tables 1, 2, and 3 of this Appendix.

A nearly identical set of questions about use of data were developed
for each type of interviewee: Directors of Research and Evaluation, program
administrators, etc., at the school district and state levels. Questions
were modified slightly to suit the context.

To make questions about use meaningful, they were constructed to
refer to specific evaluation products generated for

. Title I

. Bilingual Education

. Vocational Education

. Special Education

Questions were further blocked into categories to recognize:

. Information required by federal reports

. Data collected in addition to that required by federal reports

. Formal evaluations initiated locally and bearing on each type
of program

. Exemplary examples of use

. Federal and State reports on evaluation.

Pilot Tests

Interview questions and protocol for the site visits to local and
state education agencies were developed at Northwestern University. They
were field tested in January, 1980 through visits to the Office of Research
of the School District of Columbus, Ohio and the Department of Education
for the State of California.

Respondent Burden

It is partly in the interest of reducing the individual respondent's
burden to the lowest possible level that oral responses, rather than
written responses such as one might make on a questionnaire, are being
elicited.

It is also in the interest of reducing the individual respondent's
burden to a minimum that (a) only the questions implied by the Congress
were put to respondents, and (b) information obtained from other surveys
was used to reduce the need for probe questions. The latter includes, for
example, surveys of research units in school districts by the UCLA Center
for the Study of Evaluation.

It is in the interest of reducing respondent burden in the aggregate
that sample size for the telephone survey has been reduced to a minimum
subject to the constraint that a reasonable level of statistical precision
is met.

The same applies to the number of case studies planned. We planned,
at most, fifteen school districts and ten state offices of education. No
formal statistical standards exist for judging adequacy of intensive case
studies. But we believe that these provide informative case studies.
Every effort was made to select sites so as to assure that school districts
which have participated in related surveys during the past year did not fall
into our sample.

3.5 Site Visits and Telephone
Interviews: Title I Models

In order to provide a partial answer to the question "How are
evaluations carried out?", a special effort was made to examine the use of
Title I models at the local and state levels. The main objective was to
catalog where models are being used, which models are used, and to describe
the character of use. To assure that we understand the character of use,
data generated by state and local officials on the basis of one model were
also obtained. The main vehicles for this study were site visits and
telephone interviews.

The ten Technical Assistance Centers in the Title I program were called
to obtain information about the use of Title I models. State Title I
coordinators and their evaluation personnel in ten states were also called
on the same topic. At least twelve local education agencies were also
contacted for information by telephone. The TAC coverage was complete.
States were called only when TACs could not provide sufficient information
or when we chose to pursue information in more detail. Only those local
education agencies which use model C were called, based on referrals from
the Technical Assistance Centers.

One site visit was made to the Region I (New England) Technical
Assistance Center. Within Region I, the Providence Board of Education was visited.
A second site visit was made to state education offices and three local
education agencies in Florida on the basis of Florida's diversity in use
of Title I models.

Complete data on the use of Title I models and testing data more
generally were obtained from school districts in Providence, Rhode Island;
Marion County, Florida; and Osceola County, Florida.

3.6 Roundtable Discussions



Informal roundtable discussions were developed to supplement the
information generated from this study's case studies, surveys, and
literature reviews. The participants, individuals who are generally
knowledgeable about a particular topic, were invited to meet for
discussions at Northwestern. The agenda for each was structured to invite
opinion and evidence on questions this study was asked to address. The
discussion topics, participants, and staff organizers are described below.

Roundtable on
Community Reactions to Evaluation

On March 3, 1980, six respected and well-informed Evanston school
parents discussed evaluation. They were selected based on their
long-standing involvement with Evanston School District and their influence
in the community. The school-community leaders who were invited include:

Barbara Elmer - President, District 202 School Board
Alice I^eiman - President, District 65 School Board
Sharon Peterson - Member, District 65 School Board
Jessica Feldman - Head, Community-School Volunteer Force and
Past President, Evanston PTA Council
Ethel Hilkevitch - Parent Advocate, Consultant for Special
Education Evaluations
tevis HagemMn - Head, School Education, Evanston PTA Council
and Evaluation Consultant, Chicago Board of Education

District 65 is the Evanston K-8 elementary district. Current
enrollment is 7,071 and the budget is about 20 million dollars. District 202
is the Evanston 9-12 high school district. Current enrollment is 3,821
with a budget of about 19 million dollars. Stephany Creamer organized
the roundtable discussion.



Roundtable on
Evanston School District Evaluation

Ida Lawler, Director of Testing and Evaluation at the Evanston School
District 65, led a roundtable discussion with staff of this study on
February 18, 1980.

Roundtable on Technical
Assistance and Program Evaluation

Laura Crane, of Educational Testing Service, organized a roundtable
discussion at ETS. Participants in the February 19, 1980 meeting included
most staff of the Title I supported Technical Assistance Center at ETS,
and its director, Ted Storlie.

Roundtable on Vocational
Education Program Evaluation

John Grasso, of the University of West Virginia, and former
Congressman Roman Pucinski were principal speakers at a roundtable discussion
on evaluating vocational education programs on March 10, 1980. Janet Weeks
organized the discussion.

Roundtable on
Utilization of Evaluations

The Roundtable took place March 14, at Northwestern. Its main purpose
was to better understand current research on utilization of evaluation
findings.

Participants included: 

Dr. Jane David - President, Bay Area Research Associates. Dr. David
is the author of a report for ASPE concerning school district uses of
evaluations and a history of Title I programs.

Dr. James DeGracie - Director of Research, Office of Research and
Evaluation, Mesa Public School District, Mesa, Arizona.

Dr. Frieda M. Holley - Director of Research and Evaluation, Austin
Public School District, Austin, Texas, has studied utilization of
evaluation in school districts.

Dr. Mary Kennedy - Project Director, the Huron Institute. Dr. Kennedy
is principal investigator for a study of utilization of evaluations.
Dr. Kennedy was formerly with the Bureau of the Educationally Handicapped.

Dr. Lee Sproull - Assistant Professor of Social Science, Department of
Social Science, Carnegie-Mellon University. Dr. Sproull is principal
investigator in a study funded by the Carnegie Corporation of school
administrators' use of test scores.

Martin Bulmer - Professor, London School of Economics.

Hillel Weinberg - Congressional Staff, Office of Congressman Benjamin Gilman

Theodore Storlie - Director, Technical Assistance Center, Region #5,
Educational Testing Service.

Laura Leviton organized the roundtable discussion with the assistance
of Stephany Creamer.
Roundtable on Manpower

A roundtable focusing on issues related to the capabilities of edu-
cational evaluators and the factors involved in managing and conducting
evaluation research was held May 9th at Northwestern. The purpose of the
roundtable was to elicit experts' reactions to findings and to proposed
recommendations on evaluation manpower, training, and management of research
at the federal, state, and local levels. The individual participants were:

Dr. Launor Carter - Vice President, System Development Corporation,
Santa Monica, California. A major contractor for federal evaluations,
Dr. Carter and his staff have been responsible for conducting such
projects as the Longitudinal Evaluation of the Emergency School
Assistance Act Pilot Program, the Evaluation of the Title I program in
State Institutions for Neglected and Delinquent Children, and the
Evaluation of the Sustaining Effects of Compensatory Education.

Dr. Harrison Fox - Visiting Professor, Industrial College, Washington,
D.C. Dr. Fox is co-author of the book Congressional Staff: The
Invisible Force in American Lawmaking, which in part focuses on the
capabilities of congressional staffers and their associated activities.

Dr. Alexander Law - Deputy Superintendent for Research and Evaluation,
State Department of Education, Sacramento, California. Dr. Law's
expertise is in the area of state educational evaluation and the
management of these activities, and he is the author of several papers
on these topics.

Dr. Len Nachman - Evaluation Supervisor, Office of Planning and
Evaluation, State Department of Education, St. Paul, Minnesota.
Dr. Nachman is currently investigating the practice of outside
contracting at the state level and the development of appropriate
guidelines for the selection of qualified contractors.

Ms. Laure Sharp - Bureau of Social Science Research, Washington, D.C.
Ms. Sharp is coauthor of an early study on the federal contracting
process for social program evaluation and the more recent project on
identifying the characteristics of those organizations currently
responsible for educational research, development, dissemination and
evaluation.

Dr. Carol Tittle - School of Education, University of North Carolina,
Greensboro, North Carolina. Dr. Tittle has conducted vocational
education evaluations for the state of New York and is one of the
co-authors of the Handbook of Vocational Education Evaluation
published by Sage Press. She is currently a professor in the master's
program in educational evaluation at Greensboro.



Georgine Pion organized the roundtable with the assistance of
Stephany Creamer and Lucina Gallagher.
3.7 Formal Presentations and Discussions

In the time available, the Project staff was able to make only a few
formal presentations of some results of the study. Our main interest in
doing so was to elicit professional criticism of the character of results
and recommendations. The meetings at which we discussed parts of the
findings include:

1. American Educational Research Association Annual Meeting,
Task Force on Meta-analysis, Boston, April 8, 1980.

2. Special National Workshop, Research Methodology and Criminal
Justice Program Evaluation, Baltimore, March 17, 1980.

3. National Academy of Sciences, Committee on Program Evaluation,
Washington, D.C., March, 1980 and Evanston, Illinois, June, 1980.

4. International Association for Social Science Information
Service and Technology, Annual Conference, Washington, D.C.,
May 2-4, 1980.
3.8 Machine Based Searches

National Institute of Education Machine Based Search

A search of all grants and contracts bearing on program evaluation
was initiated using the NIE's computer based "Program Management System."
The search was based on key words such as: evaluation, student evaluation,
effects, effectiveness, program evaluation, achievement, gains, data bases,
analysis. Information on grant obligations for fiscal year 1977 through
1979 was available. We were unaware of the existence of the system,
despite interviews with NIE staff, until March, 1980. We should have in-
formed ourselves about it earlier. We are grateful to Dr. Lois-ellen Datta
of NIE for her assistance in doing so.

We attempted to classify grants and contracts using the categories we
have used routinely for classifying types of evaluation research: needs
assessment, process/implementation, outcome. Much of the research covered
two of these areas simultaneously and a good deal of it had to be classified
into a fourth broad category: technical research, support, and development.
So for example a study at New York's Empire State University of a management
information system was classified as bearing most directly on process/im-
plementation. Research on development of standards was classified as technical
research, support, and development. A training program for women at San
Francisco in education was classified as technical research, support, and
development. A study of determinants of career entry at Johns Hopkins was
classified as exploratory/needs and as outcome evaluation research.

Legal Search

The LEXIS computerized legal document retrieval system was used to
search the United States Code for statutes bearing on evaluation. The
search employed joint occurrence of the key words education and evaluation
as the main criterion for enumerating statutes. Scope of the search was
narrowed further by confining attention to Title 20 of the Code, containing
statutes relevant to public education. Joe S. Cecil, an attorney with a Ph.D.
in methodology and evaluation research, carried out the investigation. His
report, describing the search strategy, difficulties encountered in using
LEXIS, and results is given in an Appendix.
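
The joint-occurrence criterion described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Statute record, the cooccurrence_search function, and the three sample entries are hypothetical and are not drawn from LEXIS or from the United States Code; the actual search strategy is the one described in the Appendix.

```python
from dataclasses import dataclass


@dataclass
class Statute:
    title: int    # U.S. Code title number (hypothetical record layout)
    section: str  # placeholder section label
    text: str     # statute text


def cooccurrence_search(statutes, keywords, title=None):
    """Return statutes whose text contains every keyword (case-insensitive),
    optionally restricted to a single title of the Code."""
    wanted = [k.lower() for k in keywords]
    hits = []
    for s in statutes:
        if title is not None and s.title != title:
            continue  # narrow the scope to one title
        body = s.text.lower()
        if all(k in body for k in wanted):
            hits.append(s)
    return hits


# Invented sample records, for illustration only.
SAMPLE = [
    Statute(20, "sec-a", "The Commissioner shall conduct an evaluation of each education program."),
    Statute(20, "sec-b", "Funds are authorized for school construction."),
    Statute(42, "sec-c", "An evaluation of health education services shall be made."),
]

hits = cooccurrence_search(SAMPLE, ["education", "evaluation"], title=20)
print([s.section for s in hits])
```

Requiring both key words keeps statutes that mention only one of them out of the result, and the title filter plays the role of confining attention to Title 20. Simple substring matching is used here for brevity; a real retrieval system would apply its own matching rules.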

Educational Resources Information Center (ERIC)

ERIC and other similar computerized bibliographic services (NTIS,
Smithsonian) were used by several project staff primarily to identify
important journal articles, government reports, and research activities
bearing on the Project's objectives. Searches typically focused on subjects
such as evaluation of vocational education programs or the use of evaluations
in public policy making. A considerable amount of useful information, par-
ticularly in the form of government reports and documents, was generated
in these searches. Unfortunately, there is no good way for users to estimate
the coverage of these subject-area searches. This is true not only in the
[...] not found by the particular search strategy employed [...]
selection process for including documents [...] for identifying and
retrieving specific [...]

[...] of selective search can be done in a number of different ways
[...] sponsoring or funding agency or federal contract
number. For example, a sample of 50 recent evaluation contracts of USOE's
Office of Evaluation and Dissemination were selected for review. ERIC was
[...] the contract number, to identify and retrieve the reports
[...] contract. Slightly less than half of the contracts
[...] ERIC citations. However at least a third of these contracts
[...] available through ERIC based on a listing obtained from USOE.
[...] of the 50 evaluation contracts did have reports catalogued in ERIC.

This example illustrates both of the coverage problems described earlier. It
[...] that steps need to be taken to standardize entries within the
[...] government reports and documents
[...] inclusion in the ERIC system. The ERIC searches
were executed [...]
3.9 Evasive and Otherwise Diverting Responses

1. In response to the question: May we have the report?

. It's upstairs.
. We only printed a few and I don't have any left.
. I'll send it to you first thing in the morning.
. Gee, we don't issue them in hard cover.

2. In response to the question: How do you use evaluation results?

. Let me tell you about our program.
. I use them all the time.
. Let me tell you about the history of this city.

3. In response to the question: Can you provide a citation to the
studies you referred to earlier?

. Let me tell you about the testimony I've heard.
. It's in the Hearings, sometime during 1978-79, in one
of the 30 odd volumes.

4. In response to the question: Who asks for these evaluations?

. The bureaucrats (if respondent is an ex-Congressman).
. The Congress (if respondent is a bureaucrat).
. The "feds" (if respondent is a local school district
person).

5. Program director's reaction to an evaluator's report that the program
has little or no detectable impact:

. The evaluator is inexperienced, or doesn't understand
the program.
. The methodology is wrong; tests are irrelevant.
. It came too late to use in our decision.
. Let's see if we can find another evaluator.

6. Can you provide evidence that the report was useful?

. Everybody said it was useful.
. Ask me another question.
. Let me answer another question.
7. In answer to the question: How do you know your program is effective?

. Let me tell you about little Mary.
. Let me tell you about little Sue.
. Let me tell you about Jack who participated in the program
and succeeded and Jill, his sister, who did not participate
and is in jail.

8. In response to the question: What do you think of bureaucrat Y's
efforts to evaluate?

. A decent man, a good friend, and neighbor.
But he's been wrong for the past 10 years (if a bureaucrat
responds).
. He's been wrong for the past 10 years (if Congressional
assistant).
. He's been right for the past 10 years (if Congressional
assistant).
. Who is he? (if parent, principal, etc.)

9. How do you use evaluations?

. We use evaluations like a drunk uses a lamp post... for support
rather than illumination.
. You mean we're supposed to use them?
Tabular Comparison
of Standards for Evaluation

Roger B. Straw and David S. Cordray
[...] attempt to make a comparison between six
sets of standards or criteria which have been proposed for judging the
[...]. The standards proposed by the Joint Committee
on Standards for Educational Evaluation are listed in column 1 and are
used as the basis against which to compare the other standards. The
main reasons for choosing the Joint Committee's classification are that
[...] descriptions and analysis of each standard
[...] practice and that each standard can
be reasonably described with the indicated descriptor. The descriptors
[...] definitions immediately follow the [...].
[...] five columns compares another set of standards
to the Joint Committee's [...] entry, it indicates that the
standard [...] equivalent to the corresponding Joint Committee
[...] to elicit recognition
[...] apparent. In columns 3 through 5, the number refers
[...] original document. The numbers
[...] associated content are reproduced following the table.
The standards used by the Joint Dissemination Review Panel include
[...]
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VuanJJb^ Context 
and 

MaJUji Meo^u^emenA 
CQfMwt 

UtiJUty StrndoAds 
Audcenae 

tnioHiMtlon Scope 
and Se£eet<^n 

^^...Mp^ 



UiUmial VaiMUy 
IrOMMZ VatixUty 



ObjtcjtLvAXy 
CMMbAJUty 



hi 



5,1 



11,12 
70 



U6 



5,10,13,14, 

20,21,49,50 

15,17,ni 



16,17,23 

79,20,21,22,24, 
25,26,30,50 

26,32,33,34, 
35,36,37,31 

32,33,34,35, 
36,37 

39,46 

40,47 



6,72 
2,71 



13 41,42,46 
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M,A3,A6,B1, 
M,E1,A6 



A3,A5,Bi 
AS 

m,ci 



A3,PI 

VI, VZ 

M,ki 
V1,V2,V3 



* 

* 



* 



? Report T4m€JUnu4 

TomaJL ObJUQCUUjon 

fuZt oM fmnk 

^ ' ' to lOw uo 

K4ght6 Human 

Human VntmacXAjom 

PolULaal VUUtUy 
Cost E^iieatcv£rt€i4 



PU Pette Kap pa 



Em 


GAP 


47,4i,S4,SB 


V2 


5,6,51 


A7 


5,6,52,54,56 




5,7,9,10,11 


A7,M,m 


6,i 


A7 


41,44 


VI 


6,9,30,48,49 


A7,E2 


6,9,U,29 




29 




45 


VI 


4 ■ 





27 
6 

6,10 



A7 
A7 






Joint Committee on Standards
for Educational Evaluation: Draft Standards, 1978

Descriptor: Definition

Described Objects: The object of the evaluation (program, project, material) should be described so that it is clear what form(s) of the object is (are) being considered in the evaluation.

Described Context: The context in which the program, project, or materials exist(s) should be described in enough detail so that the likely influences of the context on the object may be identified.

Described Purposes and Procedures: The purposes and procedures of the evaluation should be described in enough detail so that they can be identified and assessed.

Described Information Sources: The sources of information should be described in enough detail so that adequacy of the information can be assessed.

Valid Measurement: The information-gathering instruments and procedures should be chosen or developed and implemented in ways that will assure that the interpretation arrived at is valid for the given use.

Reliable Measurement: The information-gathering instruments and procedures should be chosen or developed and implemented in ways that will assure that the information obtained is reliable.

Data Control: The data used in an evaluation should be reviewed and corrected so that evaluation reports will not be needlessly flawed.

Analysis of Quantitative Information: An evaluation's quantitative information should be appropriately and systematically analyzed so that supportable interpretations are enabled.

Analysis of Qualitative Information: An evaluation's qualitative information should be appropriately and systematically analyzed so that supportable interpretations are enabled.

Justified Conclusions: The conclusions reached in an evaluation should be explicitly justified so that the audiences can assess them.
Objective Reporting: Evaluators should be independent of what is evaluated, and the evaluation procedures should provide safeguards so that the evaluation findings and reports are not distorted by the personal feelings and biases of any party to the evaluation.

Evaluator Credibility: The persons conducting the evaluation should be both trustworthy and competent to perform the evaluation so that their findings achieve maximum credibility and acceptance.

Audience Identification: Audiences involved in or affected by the evaluation should be identified so that their needs can be served.

Information Scope and Selection: Information collected should be of such scope and selected in such ways that pertinent questions about the object of the evaluation are addressed and so that the information is responsive to the needs and interests of specified audiences.

Report Clarity: The evaluation report should clearly describe the object being evaluated and its context and the purposes, procedures, and findings, so that the audiences will readily understand what was done, why it was done, what information was obtained, what conclusions were drawn, and what recommendations were made.

Report Dissemination: Evaluation findings should be disseminated to clients and other right-to-know audiences so that they can assess and use the findings.

Report Timeliness: Release of reports should be timely so that audiences can best use the reported information.

Evaluation Impact: Evaluations should be planned and conducted in ways that encourage follow-through by members of the audiences.

Formal Obligation: Obligations of the formal parties to an evaluation (what is done, how, by whom, when) should be agreed to in writing, so that these parties are obligated to adhere to all conditions of the agreement or formally to renegotiate it.






Conflict of Interest: Conflict of interest should be avoided in evaluations or, should it occur, dealt with openly and honestly so that it does not compromise the evaluation processes and results.

Full and Frank Disclosure: The evaluator's oral and written reports should be open, direct, and honest in their disclosure of pertinent findings, including the limitations of the evaluation.

Public's Right to Know: The formal parties to an evaluation are responsible to uphold the principles of the public's right to know, within the limits of other related principles and statutes, such as those dealing with public safety and the right to privacy.

Rights of Human Subjects: Evaluators must design and conduct their evaluations in such ways that the rights and welfare of the human subjects are respected and protected.

Human Interactions: Evaluators should be respectful of human dignity and worth in their interactions with other persons associated with an evaluation.

Balanced Reporting: The evaluation should be equitable in its presentation of strengths and weaknesses of the object under investigation so that strengths can be built upon and problem areas addressed.

Fiscal Responsibility: The evaluator's allocation and expenditure of resources should reflect sound accountability procedures and should be otherwise prudent and ethically responsible.

Practical Procedures: The evaluation procedures should be practical so as to ensure that associated disruption is kept to a minimum, and that findings can be obtained.

Political Viability: Evaluations should be planned and conducted with anticipation of the different positions of various interest groups so that possible attempts by any of these groups to curtail evaluation operations or to bias or misapply the results can be averted or counteracted.

Cost Effectiveness: The evaluation should produce information of sufficient value to justify the resources used.
Center for the Study of Evaluation:
Standards for Reviewing Evaluation Reports, 1978

Standard Number   Standard Content

1   The program or product or other object under study in the evaluation is described so that its objectives are clear.

2   The program or product or other object under study in the evaluation is described so that the form of its actual implementation is clear.

3   The purposes of the evaluation are described; purposes may be stated in terms of the evaluation questions or objectives.

4   Audience(s) for the evaluation information are identified.

5   Participants in the educational program and the evaluation study, and how they were selected for participation, are described.

6   Data collection sources, such as tests, records, or observation forms, are identified.

7   The data collection sources are comprehensive enough to answer the evaluation questions.

8   The reliability of the data collection sources, and the validity of the data collection sources for the purposes intended is described.

9   Data analysis procedures are described or are evident (as in detailed tables).

10  Evaluation results are described or presented.

11  Conclusions or recommendations are drawn from the results.

12  The congruence of the conclusions with the information provided is described or evident.

13  The written presentation of whatever was done in the evaluation is clear (even if standards above were not met).
Evaluation Research Society
Draft Standards

Standard Number   Standard Content

1   The purposes and characteristics of the program or activity to be addressed in the evaluation effort should be specified as precisely as possible.

2   The decision makers and potential users of the evaluation results should be identified and their expectations made clear.

3   The type of evaluation effort required should be identified and its objectives made clear; the range of activities to be undertaken should be specified.

4   An estimate of the cost of the proposed evaluation effort and, where appropriate, alternatives should be provided; this estimate should be prudent, ethically responsible, and based on sound accounting principles.

5   Agreement should be reached at the outset that the evaluation is likely to produce information of sufficient information value, applicability, and potential for utilization to justify the resources used.

6   The feasibility of undertaking the evaluation effort should be estimated either informally or through formal evaluability assessment.

7   Restrictions, if any, on access to the data and results from an evaluation should be clearly established and agreed to between the evaluator and the client at the outset.

8   Potential conflicts of interest should be identified and dealt with openly and honestly. Should the possibility of conflict of interest occur, steps should be taken to avoid compromising the evaluation processes and results.

9   Respect for and protection of the rights and welfare of all parties to the evaluation should be a central consideration in the negotiation process.

10  Accountability for the technical and financial management of the evaluation once it is undertaken should be clearly defined.
11  All agreements reached in the negotiation phase should be specified in writing, including the obligations of all formal parties to the evaluation and of all parties with specific roles to play in the effort.

12  Evaluators should not accept obligations that exceed their personal qualifications or the resources available to them.

13  For all types of evaluations, a clear methodological approach or design should be developed and justified in order that rival explanations and threats to the validity of conclusions and inferences can be anticipated.

14  For impact studies, the central evaluation design problem of estimating the effects of non-treatment, and the choice of a particular method for accomplishing this, should be fully described and justified.

15  If sampling is to be used, the details of the sampling methodology (choice of unit, method of selection, time frame, etc.) should be explained, justified, and based on an explicit analysis of the evaluation's requirements, including generalization.

16  The measurement methods and instruments should be specified and described, and their reliability and validity estimated for the population or phenomena to be measured.

17  Professionally outmoded or discredited procedures and instruments should not be specified for use.

18  The necessary cooperation of program staff, affected institutions and members of the community, as well as those directly involved in the evaluation, should be planned for and assurances obtained.

19  A comprehensive data collection and preparation plan should be developed in advance of any data collection.

20  Given a sound design, the data collection plan should conform to it. However, if the development of the data collection plan produces insights into realities that might compromise the design, appropriate revisions of the design should be made before proceeding.
21  Provision should also be made for the detection, reconciliation, and documentation of departures from the original design that may occur during data collection.

22  Evaluation staff should be selected, trained, and supervised according to criteria that ensure competence, consistency, impartiality, and ethical practice.

23  The validity and reliability of data collection instruments and procedures should be verified under the prevailing circumstances of their use.

24  There should be an analysis of the sources of error that need to be taken into account, and provisions for quality assurance and control should be established that are collectively adequate to meet the requirements of the overall design and anticipated analyses.

25  The data collection and coding procedures should provide safeguards so that the findings and reports are not distorted by the personal feelings and biases of data collectors.

26  Data from secondary sources should be verified.

27  Data collection activities should be conducted with a minimum of disruption and imposition in the program or other settings where the data collection activities take place.

28  Procedures that might entail significant adverse effects or risks should be subjected to external review; if approved, informed consent should be obtained in advance of their application.

29  All data collection activities should be conducted so that the rights, welfare, dignity, and worth of individuals are respected and protected.

30  Data should be handled and stored so that unintentional release is prevented and so that access to individually identifying data is as limited as possible.

31  Documentation should be provided of the source, method of collection, circumstances of collection, and processes of preparation for each item of data.
32  The analytic procedures should be matched to the general purposes of the evaluation effort, the design, and the data collection.

33  All analytic procedures, along with their underlying assumptions and limitations, should be described explicitly, and the reasons for choosing the procedures should be explained.

34  Analytic procedures should be appropriate to the properties of the measures used and to the quality and quantity of the available data.

35  The units of analysis should correspond to the units of assignment and comparison.

36  No outmoded or discredited analytic procedures should be employed.

37  The analyses should be replicable by other qualified evaluators.

38  When quantitative comparisons are made (e.g., X is greater than Y), some indication of the confidence that can be placed on the stated differences and some indication of their magnitude or consequence should be provided.

39  Cause-and-effect interpretations should be bolstered not only by reference to the design but also by recognition and elimination of plausible rival explanations.

40  Purported "findings" should be reported in a manner that distinguishes fact from speculation and objective inference from subjective interpretation.

41  Findings and recommendations should be presented clearly, completely, and fairly. (See standard 40.)

42  Findings and recommendations should be organized and stated in forms that are conducive to understanding by the intended audience and address their decision-making requirements.

43  Findings and recommendations should be presented in a framework that indicates their relative importance.
44  Assumptions and limitations should be explicitly acknowledged.

45  Those issues and unanswered questions that need further study should be identified.

46  Complete explanation and description of how findings and results were derived should be accessible.

47  Persons, groups, and organizations who have made contributions to the evaluation effort should receive feedback appropriate to their perspectives and possible applications.

48  Disclosure should follow the legal and proprietary understandings agreed upon in advance (standard 7), with the evaluator serving as a proponent for the fullest, most open disclosure possible.

49  Policies and procedures on access to the data should be formulated and made available; these should specify the officials authorized to release data and the criteria for release.

50  The finished data base and associated documentation should be organized in a manner consistent with the accessibility policies and procedures. (See standards 30, 31, and 33.)

51  Evaluation results should be made available to appropriate users before the relevant decisions and discussions about the program must be undertaken.

52  Users of evaluation results should be encouraged and helped to undertake further explorations of those issues and questions not resolved by the evaluation effort. (See standard 45.)

53  Evaluators should try to anticipate possible misinterpretations and misuse of evaluative information and should provide safeguards to diminish the improper use of evaluative information.

54  Within the limitations of the initial understandings about disclosure, activities to further utilization should be designed with consideration for any broader implications that the evaluation may have for secondary audiences.

55  The evaluator should bring to the attention of decision makers any important program effects associated with the evaluation process.

56  In the utilization process, evaluators should differentiate clearly between their activities as change agents and their roles as relatively impartial scientists.
U.S. GAO Exposure Draft: Assessing Social Program Impact
Evaluations: A Checklist Approach, 1978

Standard Number   Standard Content

A1  Have evaluation goals been defined and described?

A2  Has the evaluability of the program been determined?

A3  Has a clear evaluation approach been developed and justified, and potential threats to the validity of conclusions and inferences anticipated and accommodated?

A4  Has a method for sample selection been explained and justified?

A5  Have measurement methods been identified and their validity and reliability assessed?

A6  Have the frequency and timing of measurements been specified and explained?

A7  Has the feasibility of performing the evaluation been examined?

A8  Has the necessary cooperation been obtained?

B1  Have procedures for quality control of data been identified and implemented?

B2  Have preliminary analyses been performed to detect missing or inconsistent information and correct deficiencies in the study plan?

C1  Have the statistical methods and model for use in the analysis and the rationale for their selection been specified?

C2  Has the unit of analysis been justified?

C3  Have the assumptions essential to statistical methods and model been specified and have their conditions been met?

D1  Have the findings been presented clearly, completely and fairly?

D2  Have specific procedures been used to assure the report's quality?
D3  Have follow-up provisions been made to assist decision makers in using the report?

E1  Has adequate documentation of the evaluation been maintained?

E2  Has a procedure been established for release of data for audits, reanalysis and other evaluations or research?